Arató Bence
Managing director, BI Consulting
He leads the research activities of the annual BI-TREK and DW-TREK surveys, which collect information and user feedback about the local BI & DW market. He also teaches several BI- and DW-related classes at the Hungarian BI Academy.
He has been writing about business analytics on the BI.hu website since 1998 and tweets as @bencearato.
The State of Data
The last few years have been a very intense period in the data world, with numerous new technologies emerging and taking a central place in applications. A vast number of new data sources have also become easily available, from detailed activity logs to sensor data.
Based on these new data sources and technologies, we are now collecting, using, and analyzing data in new, creative, inspiring and sometimes alarming ways.
The talk gives an overview of the current technology trends in data warehousing and Big Data, and offers a few examples of how these trends affect our everyday lives.
Tóth Zoltán
Tech Lead Data Services, Prezi
Lessons learned building a petabyte-scale data infrastructure
Back in 2011 at Prezi we started off with a single SQL query that worked on a few megabytes of data and produced somewhat accurate numbers satisfying basic business needs. This used to be our BI platform. Today we run a data infrastructure with around 70 high-performance servers that crunch hundreds of gigabytes of data and feed hundreds of reports every day.
Along this journey we used standard Unix and statistical software, later on-premises Hadoop clusters, NoSQL databases and third-party BI tools. Learning from our mistakes, we rebuilt our data infrastructure and ETL systems many times.
I’ll share the successes and misses we encountered throughout this journey, with a special focus on our current experiences with managed solutions such as Amazon’s Elastic MapReduce Hadoop solution and Redshift, Amazon’s hosted data warehouse solution.
Luis Moreno Campos
EMEA Big Data Solutions Lead, Oracle
His work covers campaigns, partner development and sales enablement. A regular speaker at industry and technology events, CIO roundtables, technology user groups, university seminars, and marketing events, Luis has international experience in all phases of Big Data projects.
Prior to joining Oracle he held roles in consulting, training and business development, with expertise in Telco, Financial Services and Media.
Enthusiastic about what data-driven solutions can do to trigger innovation in every sector, as well as about the social impact of technology, Luis travels the globe helping businesses innovate and gain a competitive edge from Big Data.
When not traveling, Luis is in Lisbon with his family, dogs and friends. A tennis lover, stand-up comedy fan, and passionate about cooking, Luis is also known as a regular blogger, book publisher and a Twitter addict!
Big Data at Work
Big Data is a phenomenon. Big Data is the datafication of everything in business, government and even private life.
We are in the very early stages of datafication, and already we’ve seen big changes. But there is a basic issue: the world’s ability to produce data has outstripped most companies’ ability to use it.
Companies need not only more processing power to get value out of big data, but a new way of thinking about what value they can get.
To change the business, you have to take the data available to you and figure out what you can learn from it. And as data grows exponentially, you need new technologies to dramatically reduce the time, cost and effort of forming and testing hypotheses. But these two approaches are more powerful together than either alone.
Combining these two approaches into a seamless deployment is Big Data at work.
This session will show how Big Data is forcing a top-to-bottom re-evaluation of several industries, and how organisations using Oracle are currently reaping value from the investments they have made.
Wouter de Bie
Team Lead Data Infrastructure, Spotify
When the dot-com bubble burst, Wouter decided to get a bachelor’s degree in computer science while working for various companies as a developer and system administrator. After finishing his studies he worked at McNolia, where he was responsible for their hosting environment and gradually became a project manager.
In 2007 Wouter decided to pursue a more technical career as a freelance developer and technical lead. In late 2008 he co-founded Jewel Labs, a company focused on developing software for cinema and festival ticketing, where he acted as CTO.
In 2009 Wouter moved to Sweden for personal reasons and worked as a Ruby developer and system administrator at Delta Projects, one of Sweden’s biggest online ad serving companies, before joining Spotify in 2011.
Wouter is currently working as a team lead for Data Infrastructure at Spotify, where he manages a team of 10 data engineers who build the infrastructure that enables the rest of Spotify to work with data. He is also responsible for Spotify’s data architecture.
Using Big Data for fast product iterations to drive user growth
In this talk we’ll look at how Spotify uses data and Big Data technology to make fast iterations on the Spotify product. Some of the questions we’ll try to answer are “Why is fast product iteration important for us?”, “How does data tie into this?” and “What is it we do to achieve this?”
Stephen Brobst
Chief Technology Officer, Teradata
Best Practices in data warehouse architecture
Proper architecture of a data warehouse has a significant impact on the return on investment obtained from its deployment. This seminar provides a taxonomy of data warehouse topologies and discussion of best practices for enterprise data warehouse deployment. Implementation techniques using integrated, federated, and data mart architectures are discussed along with rules of thumb for when and how to implement these structures as required by analytic applications. A framework for understanding cost and value implications of the various approaches will be described.
- Learn about the performance tradeoffs of different data warehouse architectures.
- Learn about the speed-of-delivery tradeoffs between different data warehouse architectures.
- Learn about the cost tradeoffs between different data warehouse architectures.
- Learn how to use business requirements to choose best-of-breed architecture.
Todd Goldman
Vice president for Data Integration, Informatica
Great Data Isn't an Accident, It Happens by Design
Great data isn’t an accident. Great data happens by design. The challenge is that data is becoming increasingly fragmented. The explosion of technologies around the Internet of Things and Hadoop, along with the ability of lines of business to purchase their own applications in the cloud, makes managing data and achieving great insights a challenge. This session will focus on:
- The effects of the evolving data landscape
- How achieving great insight happens by integrating across fragmented data boundaries and automating data quality processes
- The characteristics of companies that are leading in the use of quality information to transform their organizations (vs. those that don’t)
Marcin Bednarski
CRM & Retail Controlling Dep. Director, PKO BP
Interactive CRM at PKO Bank
This presentation tells the story of the real-time interactive CRM journey of PKO Bank Polski, the leading bank in Poland and CEE. It is a real business case of a company that believes in the power of modern technology while being a traditional bank with a long history, and a story of how new approaches merge with tradition and bring value to everybody. The presentation also includes some good examples of innovative banking solutions from all over the world. The aim is to send a message to the entire banking world: banks must change or die, because customer expectations are changing.
Stephan Ewen
Ph.D. Student, TU Berlin
The Stratosphere big data analytics platform
Stratosphere is a next-generation, Apache-licensed platform for Big Data analysis, and the only one originated in Europe. Stratosphere offers an alternative runtime engine to Hadoop MapReduce, but uses HDFS for data storage and runs on top of YARN. Stratosphere features a scalable and very efficient backend, architected using the principles of MPP databases, but not restricted to the relational model or SQL. Stratosphere’s runtime streams data rather than processing it in batches, and uses out-of-core implementations for data-parallel processing tasks, gracefully degrading to disk if main memory is not sufficient.
Stratosphere is programmable via a Java or Scala API similar to Cascading, which includes the common operators such as map, reduce, join, cogroup, and cross. Analysis logic is specified without the need to link user-defined functions manually. Stratosphere includes a cost-based program optimizer that automatically picks data shipping strategies and reuses prior sorts and partitions. Finally, Stratosphere features end-to-end, first-class support for iterative programs, achieving performance similar to Giraph while still being a general (not graph-specific) system. Stratosphere is a mature codebase, developed by a growing developer community, and is currently seeing its first commercial installations and use cases.
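To make this operator-style programming model concrete, here is a tiny sketch written against plain Scala collections. It is purely illustrative – these are not the actual Stratosphere classes or method names – but it shows the kind of map/reduce and join logic such dataflow programs express; in Stratosphere the same operators are applied to distributed data sets and executed by the parallel runtime.

// Illustrative only: plain Scala collections stand in for distributed data sets.
// The names below are NOT the Stratosphere API; they only mimic the operator style
// (map, group/reduce, join) described in the abstract.
object OperatorStyleSketch {
  case class Visit(userId: Int, url: String)
  case class User(userId: Int, country: String)

  def main(args: Array[String]): Unit = {
    val visits = Seq(Visit(1, "/home"), Visit(2, "/docs"), Visit(1, "/docs"))
    val users  = Seq(User(1, "HU"), User(2, "DE"))

    // "map" + "reduce": count visits per URL
    val counts = visits
      .map(v => (v.url, 1))
      .groupBy(_._1)
      .map { case (url, hits) => (url, hits.map(_._2).sum) }

    // "join": attach each visitor's country to the pages they viewed
    val byCountry = for {
      v <- visits
      u <- users if u.userId == v.userId
    } yield (u.country, v.url)

    counts.foreach(println)
    byCountry.foreach(println)
  }
}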
Papp Lajos
devops, SequenceIQ
Provisioning a Hadoop cluster on … anywhere
Installing a Hadoop cluster is far from trivial. In this talk we show the approach, processes and toolsets we use for all the environments where we provision Hadoop. This covers provisioning targeted at a developer laptop, through QA environments and the cloud, to large production systems running on physical hardware – all done the same way.
We have built a provisioning framework based on Docker and Apache Ambari, and provide a simple REST API to create a cluster of any size on different cloud providers (Amazon EC2, Rackspace), VMs and physical hardware. We will also speak about best practices for managing and monitoring Hadoop clusters, and about dynamic cluster resizing.
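To give a flavour of the blueprint-style provisioning this builds on, below is a rough, self-contained sketch of driving Apache Ambari’s blueprint REST endpoints from Scala: the cluster layout is described as JSON, registered as a blueprint, and then instantiated by mapping host groups to real hosts. The host name, credentials and the minimal component list are illustrative assumptions, not SequenceIQ’s actual API or configuration.

// Sketch only: posts a minimal Ambari blueprint and a cluster-creation template.
// Host, credentials and the component list are assumptions for illustration.
import java.net.{HttpURLConnection, URL}
import java.util.Base64

object AmbariBlueprintSketch {
  private def post(url: String, body: String): Int = {
    val conn = new URL(url).openConnection().asInstanceOf[HttpURLConnection]
    conn.setRequestMethod("POST")
    conn.setDoOutput(true)
    conn.setRequestProperty("X-Requested-By", "ambari")
    val auth = Base64.getEncoder.encodeToString("admin:admin".getBytes("UTF-8"))
    conn.setRequestProperty("Authorization", s"Basic $auth")
    conn.getOutputStream.write(body.getBytes("UTF-8"))
    conn.getResponseCode   // 201 Created on success
  }

  def main(args: Array[String]): Unit = {
    val ambari = "http://ambari-host:8080/api/v1"

    // 1. Register a minimal, single-host-group blueprint describing the cluster layout.
    val blueprint =
      """{"Blueprints": {"blueprint_name": "single-node", "stack_name": "HDP", "stack_version": "2.1"},
        | "host_groups": [{"name": "master", "cardinality": "1",
        |   "components": [{"name": "NAMENODE"}, {"name": "SECONDARY_NAMENODE"}, {"name": "DATANODE"},
        |                  {"name": "RESOURCEMANAGER"}, {"name": "NODEMANAGER"}, {"name": "ZOOKEEPER_SERVER"}]}]}""".stripMargin
    println("blueprint: " + post(s"$ambari/blueprints/single-node", blueprint))

    // 2. Instantiate a cluster from the blueprint by mapping the host group to a real host.
    val cluster =
      """{"blueprint": "single-node",
        | "host_groups": [{"name": "master", "hosts": [{"fqdn": "node1.example.com"}]}]}""".stripMargin
    println("cluster:   " + post(s"$ambari/clusters/demo", cluster))
  }
}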
Simon Gregory
Director - Business Development & Strategic Alliances - EMEA, Hortonworks
Apache Hadoop, current state and future prospects
Apache Hadoop is moving rapidly – and so will this session, as there’s a lot to cover. I’ll provide a tour of the current Hadoop ecosystem, an update on customer adoption, what’s being worked on, and ultimately what all of that means to you. I’ll also try to cover the latest trends in interactive query, security and data governance, as well as some of the new areas of interest in projects such as Spark and Storm.
Papp Tamás
BI Architect, Ustream
How we didn't build a traditional DW for viewership numbers at Ustream
Live video platform Ustream has about 100 million live viewers per month. For several years we used a third-party tool to gather and display viewership metrics (a breakdown of view numbers by geography, device, etc.). This system was fast and reliable, but expensive and not flexible enough for us, so you bet we wanted a replacement.
When we started to think about building our own solution based on the multiple TBs of viewership logs we generate per day, we had two choices: build a “traditional” data warehouse that can give us both near-real-time and historical reports, or build something that is… well, not that expensive (“do more with less…”, as managers like to put it).
In October 2013 we had no idea about Lambda Architectures at all, but we designed and built something similar which is (relatively) cost-effective, using the Redis key-value store, Elastic MapReduce, MySQL and Tableau. Does it have disadvantages compared to a near-real-time data warehouse? Sure it does. In this case study talk we’ll go through the whole solution and see what compromises we had to make to reach our goal.
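As a flavour of the “not that expensive” speed-layer idea, here is a minimal sketch – our assumption of the pattern, not Ustream’s actual code – of keeping near-real-time viewership counters in Redis (using the Jedis client), while the full logs remain available for batch processing on Elastic MapReduce.

// Sketch only: per-channel, per-hour view counters kept as Redis hashes.
import redis.clients.jedis.Jedis            // Jedis Redis client
import scala.collection.JavaConverters._

object ViewCounterSketch {
  case class ViewEvent(channelId: Long, country: String)

  def main(args: Array[String]): Unit = {
    val redis = new Jedis("localhost", 6379)

    val events = Seq(ViewEvent(42, "US"), ViewEvent(42, "HU"), ViewEvent(42, "US"))

    // Write path: one Redis hash per channel and hour; fields are countries, values are counts.
    events.foreach { e =>
      redis.hincrBy(s"views:${e.channelId}:2014-06-01T10", e.country, 1L)
    }

    // Read path: the current geographic breakdown for the near-real-time dashboard.
    val breakdown = redis.hgetAll("views:42:2014-06-01T10").asScala
    breakdown.foreach { case (country, count) => println(s"$country -> $count") }

    redis.close()
  }
}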
Dr. Lóránd Balázs
Senior Business Consultant, T-Systems
Balázs Lóránd holds a PhD in Economics from the University of Pécs, Hungary, and has published several research articles with analytical results in recent years.
Location-based Mobile Advertising at Magyar Telekom
The greatest challenge with Big Data solutions is the real-time processing and utilization of large amounts of data. The available tools – which are able to send instant advertising offers right after data analysis – open up several opportunities for location-based mobile advertising. In this presentation I will outline a project in which we developed a system for sending real-time, location-based offers in collaboration with several partners (TSI, EMC, MI6App, Magyar Telekom).
Fábián Zsolt
Database/Security Engineer, Spil Games
Experiencing migration among Map/Reduce platforms
SpilGames is a leading publisher of HTML5, Flash and mobile games. Our main revenue driver is advertising, where our systems rely heavily on Big Data processing. This session will explain how our map/reduce systems matured and how we took on the challenge of migrating Python-based Disco map/reduce jobs to Hive. The talk tells the story of the developers involved in the transition from Disco to Hadoop, and how we kept our business online while changing tires in the pit lane.
Alex Dean
Co-founder, Snowplow Analytics Ltd
At Snowplow Alex is responsible for Snowplow’s technical architecture, stewarding the open source community and evaluating new technologies such as Amazon Kinesis. Prior to Snowplow, Alex was a partner at technology consultancy Keplar, where the idea for Snowplow was conceived. Before Keplar Alex was a Senior Engineering Manager at OpenX, the open source ad technology company.
Alex lives in London, UK.
Continuous data processing with Kinesis at Snowplow
Since its inception, the Snowplow open source event analytics platform (https://github.com/snowplow/snowplow) has always been tightly coupled to the batch-based Hadoop ecosystem, and Elastic MapReduce in particular. With the release of Amazon Kinesis in late 2013, we set ourselves the challenge of porting Snowplow to Kinesis, to give our users access to their Snowplow event stream in near real time.
With this porting process nearing completion, Alex Dean, Snowplow Analytics co-founder and technical lead, will share Snowplow’s experiences in adopting stream processing as a complementary architecture to Hadoop and batch-based processing.
In particular, Alex will explore:
- “Hero” use cases for event streaming which drove our adoption of Kinesis
- Why we waited for Kinesis, and thoughts on how Kinesis fits into the wider streaming ecosystem
- How Snowplow achieved a lambda architecture with minimal code duplication, allowing Snowplow users to choose which (or both) platforms to use
- Key considerations when moving from a batch mindset to a streaming mindset – including aggregate windows, recomputation and backpressure (see the sketch below)
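As a toy illustration of the “aggregate windows” point above (plain Scala, not Snowplow code): in a streaming design, events are bucketed into fixed time windows and aggregates are updated incrementally as events arrive, instead of being recomputed from the full event history the way a nightly batch job would.

object TumblingWindowSketch {
  case class Event(timestampMs: Long, eventType: String)

  val windowSizeMs = 60 * 1000L                       // one-minute tumbling windows

  def windowStart(tsMs: Long): Long = tsMs - (tsMs % windowSizeMs)

  def main(args: Array[String]): Unit = {
    val stream = Seq(
      Event(1000L, "page_view"), Event(30000L, "page_view"),
      Event(61000L, "add_to_cart"), Event(62000L, "page_view"))

    // Incrementally build per-window, per-event-type counts as events "arrive".
    val counts = stream.foldLeft(Map.empty[(Long, String), Long]) { (acc, e) =>
      val key = (windowStart(e.timestampMs), e.eventType)
      acc.updated(key, acc.getOrElse(key, 0L) + 1L)
    }

    counts.toSeq.sortBy(_._1._1).foreach { case ((window, tpe), n) =>
      println(s"window starting at ${window}ms: $tpe -> $n")
    }
  }
}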
Gábor Zoltán
Research Engineer, Falkstenen AB Hungarian Branch Office
Managing Financial Big Data on Hadoop
Falkstenen AB Hungarian Branch Office (formerly known as GusGus Capital LLC) is a subsidiary of a family office that trades financial assets in different asset classes (e.g. foreign exchange, equities, commodities, and their derivatives). We process large amounts of exchange-generated data using Hadoop technologies.
The exponential growth of the data, its lack of complicated structure and the huge number of records make traditional databases unsuitable (and unnecessary) for storing and processing market data. Hadoop and its ecosystem fit naturally into the typical processing scheme of this field.
In this talk we will present our (Big!) data and the problems that we had before moving to Hadoop. Insight into the structure of exchange-generated events will be given.
The hardware architecture of our 480 node Hadoop cluster will be shown. You will also learn something about our ETL and the tools that we have built or integrated to handle hundreds of TBs of data.
By the end of this talk you will know something about the problems that we are facing today, and the future development plans that we have as well.
Christoph Boden
Research Associate, TU Berlin
Cooccurrence-based recommendations with Mahout, Scala & Spark
This talk will give a preview of the latest developments and future plans in Apache Mahout. Mahout features a new Scala DSL for linear algebraic computations. Programs written in this DSL are automatically parallelized and executed on Apache Spark. I will give an introduction to the DSL and show how Mahout uses it to implement a cooccurrence-based recommender system.
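For a feel of the R-like DSL, here is a small in-core sketch; the imports assume Mahout’s math-scala bindings, and the distributed, Spark-backed variant operates on distributed row matrices (DRMs) but reads essentially the same. At its core, cooccurrence means computing A'A over a user-by-item interaction matrix A.

import org.apache.mahout.math.scalabindings._
import org.apache.mahout.math.scalabindings.RLikeOps._

object CooccurrenceSketch {
  def main(args: Array[String]): Unit = {
    // Rows = users, columns = items; 1 means the user interacted with the item.
    val a = dense(
      (1, 1, 0),
      (0, 1, 1),
      (1, 1, 1))

    // Item-item cooccurrence counts: C = A' * A.
    val c = a.t %*% a

    println(c)
    // Real Mahout recommenders further downweight the raw counts (e.g. with the
    // log-likelihood ratio) before turning them into recommendations.
  }
}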
Enrico Berti
UI Engineer, Cloudera
Open up interactive big data analysis for your enterprise
Hadoop brings many data crunching possibilities but also comes with a lot of complexity: the ecosystem is large and continuously changing, interaction happens on the command line, interfaces are built for engineers…
This talk describes how Hue can be integrated with existing Hadoop deployments with minimal changes/disturbances. Enrico covers details on how Hue can leverage the existing authentication system and security model of your company.
Through an interactive demo and a dialogue based on the open source Hue, this talk shows how users can get started with Hadoop. We will detail how one can set up Hue on a new or existing Hadoop cluster, and share best practices for integrating your company directory and security. We will also cover the underlying technical details of how Hue interacts with the ecosystem.
The presentation will continue with real-life analytics business use cases. It will show how data can be imported and loaded into the cluster and then queried interactively with SQL or a search dashboard – all through your web browser!
To sum up, attendees of this talk will learn how Hadoop can be made more accessible and why Hue is the ideal gateway for quickly getting started or using the platform more efficiently.
Balassi Márton
Big Data Solutions Developer, MTA - SZTAKI
Currently he is mainly working on designing and implementing an open-source, low-latency distributed stream processing framework called Stratosphere Streaming.
Challenges of real-time distributed stream processing
Real-time – i.e. low-latency – processing is one of the main challenges of the big data community. A variety of frameworks have been proposed for distributed stream processing, including S4, Storm, Spark Streaming and Samza, each trying to respond in its own way.
The talk sketches the current stream processing scene, supported by examples from the field of recommender systems, and demonstrates how stream processing can be applied and how it can complement batch processing.
The research behind this talk took place during the planning phase of a streaming framework augmenting the European big data analytics platform, Stratosphere. Stratosphere Streaming aims to support both very low-latency processing and high-throughput mini-batch processing in a fault-tolerant fashion. The basic stream processing engine is the contribution of the Budapest team.
Sander Kieft
Manager Core Services, Sanoma Media
Sander has been working with large-scale data in media for 15 years, working for the largest websites in the Netherlands as a developer, architect and technology manager.
Does Big Data self service scale as well as Hadoop?
This talk will take you through the past, present and future of the data platform in use at Sanoma. A few years ago Sanoma set out to build a self-service data platform using a mixture of open source and commercial technology. Central to the platform are Hadoop, Hive and Python, but nowadays the platform also includes real-time data ingestion, real-time data processing and integration with other enterprise BI and data systems.
The central question: how did this platform come to be, and does Big Data self-service deliver on the promise of freeing the data for everyone to use?
Kasler Lóránd Péter
Tech Lead Architect, Virgo Systems
When he is not coding or architecting, he spends all his time with his two-year-old daughter.
Building a recommendation engine using the Lambda Architecture
We will present our hybrid recommendation engine based on collaborative filtering and text retrieval techniques.
Our goal is to present a valid use case of the Lambda Architecture and how we tailored it to our needs. We will show how we are leveraging Hadoop and related technologies (Avro, Pail, Cascading, Mahout, SOLR) to bring a highly scalable and customisable recommendation platform to our customers and pave the way for further applications of big data. We will share what we learned from deploying the Cloudera Hadoop stack on Amazon Web Services, along with the shortcomings of the cloud-based approach we have found so far, and highlight how we use Amazon Auto Scaling Groups with key parts of our platform.
Besides the technological part, we will also present our novel usage of collaborative filtering (powered by Mahout) combined with the SOLR search engine, so that we can cover a wide range of recommendation-related needs.
Claudio Martella
PhD candidate, VU University Amsterdam
Apache Giraph: large-scale graph processing on Hadoop
We are surrounded by graphs. Graphs are used in various domains, such as the Internet, social networks, transportation networks, bioinformatics etc. They are successfully used to discover communities, to detect fraud, to analyse the interactions between proteins, and to uncover social behavioral patterns. As these graphs grow larger and larger, no single computer can process this data in a timely manner anymore. Apache Giraph is a large-scale graph processing system that can be used to process Big Graphs. Giraph is part of the Hadoop ecosystem, and it is a loose open-source implementation of the Google Pregel system. Originally developed at Yahoo!, it is now a top-level project at the Apache Foundation, and it enlists contributors from companies such as Facebook, LinkedIn, and Twitter. In this talk we will present the programming paradigm and the features of Giraph. In particular, we focus on how to write Giraph programs and run them on Hadoop.
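To make the vertex-centric (“think like a vertex”) paradigm concrete, here is a conceptual single-machine sketch in plain Scala – not the Giraph API – of the superstep model Giraph implements: in each superstep every vertex processes its incoming messages, updates its value and sends messages to its neighbours, and the computation halts when no messages remain. The example propagates minimum labels to find connected components.

object VertexCentricSketch {
  def main(args: Array[String]): Unit = {
    // Undirected graph as adjacency lists; each vertex starts with its own id as its label.
    val neighbours = Map(
      1L -> Seq(2L), 2L -> Seq(1L, 3L), 3L -> Seq(2L),
      4L -> Seq(5L), 5L -> Seq(4L))
    var label = neighbours.keys.map(v => v -> v).toMap

    // Superstep 0: every vertex announces its label to all of its neighbours.
    var inbox: Map[Long, Seq[Long]] = neighbours.toSeq
      .flatMap { case (v, ns) => ns.map(n => n -> label(v)) }
      .groupBy(_._1).map { case (v, msgs) => v -> msgs.map(_._2) }

    // Later supersteps: adopt the smallest label seen; only vertices that changed
    // send messages again. The loop halts when no messages are produced.
    while (inbox.nonEmpty) {
      val outgoing = for {
        (v, msgs) <- inbox.toSeq
        smallest = msgs.min
        if smallest < label(v)
      } yield {
        label = label.updated(v, smallest)
        neighbours(v).map(n => n -> smallest)
      }
      inbox = outgoing.flatten.groupBy(_._1).map { case (v, msgs) => v -> msgs.map(_._2) }
    }

    label.toSeq.sortBy(_._1).foreach { case (v, comp) => println(s"vertex $v is in component $comp") }
  }
}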
Dionysios Logothetis
Associate Researcher, Telefonica Research
Grafos.ML: Tools for large-scale graph mining and machine learning
Large-scale graph mining and machine learning is becoming an increasingly important area of big data analytics with applications from Online Social Network analysis to recommendations. In this talk, I will describe grafos.ml, an umbrella project with the goal of building tools and systems for graph mining and ML analytics.
In the first part of the talk, I will describe Okapi, an open source library of graph mining and ML algorithms built on top of the Giraph graph processing system. The goal of Okapi is to provide a rich toolkit of graph mining algorithms that will simplify the development of applications, such as OSN analysis at scale.
In the second part, I will talk about RT-Giraph, a system for mining large dynamic graphs. In many real-world scenarios, graphs are naturally dynamic and several applications, such as sybil detection in OSNs, require real-time updates upon changes in the underlying graph. However, existing graph processing systems are designed for batch, offline processing, making the analysis of dynamic graphs hard and costly. RT-Giraph is explicitly designed for dynamic graphs, allowing fast updates and making the deployment of real-time applications easier.
Balogh György
CTO, LogDrill
Introduction to Modern Big Data Technologies
Big Data platforms such as Hadoop have evolved and matured in recent years. We provide a brief history of Big Data evolution, focusing on the reasons behind the current paradigm shift in data processing.
Next, we present the latest open source technologies and their capabilities, such as Hadoop 2.0, Cloudera Impala and Apache Spark.
Finally we show how these technologies compare to traditional data warehouse systems.
Domaniczky Lajos
Independent Expert
After some side projects I specialised in UI technologies and mobile devices (JavaScript and Android UIs).
I have also immersed myself in big data technologies for a big multinational IT company, enabling them to implement a sustainable system and to align their KPIs with management expectations, using all of the technologies discussed in this presentation.
Analyzing Big Data using Hadoop and Hive
Writing map/reduce programs using Hadoop to analyze your Big Data can get complex and cumbersome. Hive can help make querying your data easier: it is a data warehouse system that facilitates easy data summarisation, ad-hoc queries, and the analysis of large datasets stored in Hadoop-compatible file systems. Hive provides a mechanism to project structure onto this data and to query it using HiveQL, an SQL-like language.
This presentation will show you how to get started with Hive and HiveQL. We'll discuss Hive's various file and record formats, partitioned tables, etc. We'll start with a simple data set stored in Hadoop HDFS – a set of files with a well-defined structure – and show how to map these files to a schema using DDL, as well as how to query this schema with some useful HiveQL queries.
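As a minimal sketch of those two steps – projecting a schema onto files already sitting in HDFS with DDL, then querying them with HiveQL – here is what this can look like when submitted through Hive’s JDBC interface (HiveServer2). The host, path, credentials and column names are illustrative assumptions.

import java.sql.DriverManager

object HiveSketch {
  def main(args: Array[String]): Unit = {
    Class.forName("org.apache.hive.jdbc.HiveDriver")
    val conn = DriverManager.getConnection("jdbc:hive2://localhost:10000/default", "hive", "")
    val stmt = conn.createStatement()

    // DDL: map tab-separated log files under /data/access_logs to a table,
    // without moving the data (EXTERNAL table).
    stmt.execute(
      """CREATE EXTERNAL TABLE IF NOT EXISTS access_logs (
        |  ts STRING, user_id BIGINT, url STRING, status INT)
        |ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
        |STORED AS TEXTFILE
        |LOCATION '/data/access_logs'""".stripMargin)

    // HiveQL: an ad-hoc aggregation over the newly mapped files.
    val rs = stmt.executeQuery(
      "SELECT url, COUNT(*) AS hits FROM access_logs WHERE status = 200 " +
      "GROUP BY url ORDER BY hits DESC LIMIT 10")
    while (rs.next()) println(rs.getString("url") + "\t" + rs.getLong("hits"))

    conn.close()
  }
}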
Dr. Horváth Gábor
Professional Services Manager, Teradata
Data Warehousing in the Big Data era
In traditional data-intensive industries, like banking or telecommunications, data warehouses were established a long time ago. These DW systems have evolved over time; however, they still face a number of problems that must be solved in order to effectively support decision making at these companies. On top of these problems, IT/DW organizations are also facing changes in the data management industry, and have to cope with the technology- and process-related requirements of the emerging “Big Data” paradigm.
This presentation attempts to summarize the situation that typical Hungarian telcos and banks are in, and suggests an approach that would, on the one hand, address some of the issues with their current DW systems and, on the other, start the journey towards gaining the business benefits of “Big Data”.
Some real-life, practical examples will also be shown to support this approach.
Luis Moreno Campos
EMEA Big Data Solutions Lead, Oracle
Data First Framework - How to build a Big Data architecture
The Data First approach is a fundamental shift in data management mindset from the Model First approach: two environments, one difference.
Organize data to do something specific (Run the Business), and figure out what the data can do for you (Change the Business). In this session we will learn about the architectural components and implications of this new approach.
Nagy Zoltán
Senior Analyst, Data Solutions
Big methods on not-so-big data - a telco churn case study
Big data methods could be useful even without big data.
This is a lesson learnt by a telco aiming to handle MNP (mobile number portability) churn. Traditional methods leave such migrating customers unidentified and beyond reach, but a fresh perspective can turn the tide altogether. Big data methods developed mainly for clickstream and weblog analysis have been applied to the – in many cases previously completely unutilized – data of a traditional data warehouse.
As a result, more precise and up-to-date target group selection could be achieved. Timing remains a crucial factor, though. The results also suggest that further gains could be made by including traditional churn prediction methods and by utilizing “real” big data.
Borsodi Szilárd
Senior BI/DW consultant, T-Systems
Small data: the burden of manual DWH input
There is a common feature of all data warehouse projects: manual and disparate sources of business data – all missing from the core systems. If the project fails to comfortably accommodate these data in the BI framework, the delivered solution is less likely to win the sympathy of the analysts and operators. This talk will give an overview of the real-world issues and possible solutions.
Pocsarovszky Károly
Research manager, eNet
Big Data in Mobile Network Analysis
We are going to present the highlights and key findings of working with Big Data in a mobile network environment. Our project’s aim was to identify mobile cells’ data transfer capacity and load by measuring a given set of mobile towers’ activity solely through the air interface. The estimated amount of raw data is more than 3 TB. In order to analyze such an amount of data, a proper processing framework is needed. We will present our experiences with different Big Data tools (Hadoop, MongoDB, low-level scripting) to show how they compete with traditional approaches and how they complement each other.
Biró Attila
Director of Business Solutions, Areus
Top 3 data integration problems at large enterprises
Large enterprises (e.g. banks, insurers, telcos, government and other organizations) with extensive, heterogeneous application landscapes and large databases typically face serious difficulties in the following situations:
- Data migration and ensuring data quality: When new enterprise applications are introduced, applications are upgraded or application instances are consolidated, data migration can typically account for as much as 30-40% of the total project effort. Companies often underestimate the scope of such projects, even though a migration project is on average roughly 10 times the size of a data warehouse project. In many cases these problems lead to bad data ending up in the target system, to the migration project slipping, or to it failing to meet expectations.
- Building small, compact test environments: At large enterprises, the systems reserved for testing are roughly 5 to 10 times the size of the production databases, because testing usually takes place at several levels and most companies work with full copies. Test environments built this way take up enormous space, take longer to provision, and are more complicated to maintain.
- Effective protection of test data: Even at the most advanced companies, the data used for testing is real, live customer data taken from production systems. This is a problem from several perspectives: it violates regulations, and the risk of data theft and data leakage is greatly increased, which also means a high risk of losing customers and prestige.
For the problems above we will present professional Informatica solutions that have already proven themselves at several large Hungarian enterprises. Informatica is represented in Hungary by Areus, so local expertise as well as implementation and project experience are also available.
Daume Zénó
Head of Application Support, Erste Bank
Test data management challenges and trends at Erste Bank
The presentation will cover Erste Bank’s test data management experiences: the data handling and environment-related challenges that large enterprises typically have to face. Building on the lessons of the journey so far, the presentation will also outline a forward-looking vision for this area.
Otti Levente
Emarsys
Emarsys Technologies - Data Warehouse As a Service
Emarsys Technologies has evolved from its beginnings as an Email Service Provider, through Marketing Automation, into the provider of a fully integrated Customer Engagement Platform delivered as a SaaS solution. This covers data-driven support for marketing decisions, simple execution of those decisions and measurement of their results, as well as the tracking of marketing campaigns.
In this presentation we outline the architecture of a data warehouse operated as a service, highlighting the challenges encountered during development and operations, and the experience and lessons learned along the way, focusing primarily on questions of data integration and knowledge extraction.
Gollnhofer Gábor
Data Warehouse Business Unit Director, Jet-Sol
He is a member of The Data Warehouse Institute (TDWI) and the Association for Computing Machinery (ACM), and a Certified Data Vault Data Modeler.
Introduction to DW automation
The presentation explores data warehouse automation, showing what is and is not worth automating, and why and how.
Csonka Zoltán
Data Warehouse Architect, Generali Biztosító
Data warehouse automation experiences at Generali Biztosító
The presentation focuses on the automated development of the load processes of a data warehouse loaded with Oracle Warehouse Builder.
It covers:
- why we decided to automate the development processes, and why we chose a custom solution
- to what extent it is worth automating the processes, and where the gains are
- challenges and pitfalls during the project
- based on our experience, what we would do differently
Csippán János
IT Director, Partner in Pet Food
An SME data warehouse
The presentation describes the experience gained while building the regional data warehouse of the Partner in Pet Food group. It covers what is specific to SME data warehousing, and in what ways it resembles and differs from “classic” large-enterprise data warehouses.
In the presentation I will show the main elements of the solution built at PPF and explain why we chose this kind of solution.
Dr. Nizalowski Attila, Fekszi Csaba
Senior Chief Advisor, NFM; Managing Director, Omnit
A lawyer specialized in IT law and legislative drafting. From 1990 he worked at ELTE’s Faculty of Law, its Institute of Legal Further Education and its Rector’s Office as a lecturer and IT specialist. In the 2000s he developed legal databases, for the longest time as the responsible editor of the market-leading Complex Jogtár. Since 2011 he has been designing public procurement applications at the Ministry of National Development (NFM), also taking part in their delivery as a project manager.
Fekszi Csaba
A 40-year-old data warehouse and BI expert. After completing the IT programme at KKVMF and the accounting programme at the Budapest PSZF, he obtained a master’s degree in informatics at the University of Veszprém. During his career he has taken part in the design and development of numerous systems serving mainly banking and financial processes (bank card systems, internet payments, banking data warehouses and BI solutions, applications, portals). In 2007 he founded Omnit Solutions Kft., of which he is still the managing director and majority owner.
A public procurement data warehouse on open source foundations
Building a data warehouse and business intelligence system in government IT, on open source foundations.
Is it worth trying – is it even possible – to build a serious system, serious even by industry standards, using free tools? What advantages does using a free product bring, and what drawbacks might it have? What lies beyond the rabbit? Beyond the product itself, what is needed for a DW/BI project to succeed?
Kővári Attila
Managing Director, BI Projekt
Quality assurance for data warehouses
The presentation is about the quality assurance of data warehouses, and looks for answers to questions such as why data warehouse quality assurance is needed, how to quality-assure a data warehouse, and how much should be spent on quality assurance.
Rékasi László, Szücs Imre
Dataminer/Analyst, Erste; Head of Research and Development, United Consult
Imre is a director at United-Consult Ltd., one of the leading Hungarian consultancy companies. He has more than 10 years of experience in business intelligence and data mining. Before starting his consulting career he worked in the financial and FMCG sectors.
Imre has a strong academic background: he holds MSc degrees in Physics, Astronomy and Computer Science, and is also a PhD candidate at Eötvös Loránd University.
Analyzing customer behaviour with social networks at Erste Bank
ERSTE Bank Hungary Zrt. was looking for an alternative solution for analyzing customer behaviour, as traditional BI solutions did not adequately support the on-demand design of near-real-time, multi-channel marketing activities. Within a pilot project, OrientDB – a NoSQL database engine – and a complete social network analysis framework were put in place, integrating numerous technologies – such as Gephi, R, Gremlin, … – to reach this goal. During the presentation, ERSTE Bank’s representative will present the business background of the project, while United Consult will describe the technological aspects of the graph-database-based analytical system.
Kóspál Eszter Sára
Data Warehouse Analyst and Modeler, CIB Bank
She took part in the further development of the existing data warehouse and, from 2006, in the model-based implementation of CIB Bank’s new data warehouse. Since 2013 she has been the professional lead of the data warehouse analyst team.
Data modelling in practice
The presentation deals with the questions and problems that arise in the everyday practice of data modelling.
The topics covered will include, among others:
- How does an application or a data warehouse get its data model?
- Purchased versus in-house developed data models
- Localization techniques and logics
- Quality assurance in data modelling