Hadoop • Distributed Systems • Linux • Hbase • Java • Hive • Mapreduce • Open Source • Big Data • Spring • Enterprise Software • Software Development • Apache Pig • Databases • Flume • High Availability • Maven • System Administration • Git • Data Analysis • Perl • Operating Systems • Public Speaking • Data Mining • Concurrent Programming • Parallel Programming • Zookeeper • Sqoop • Cloudera Enterprise • Data Processing • Linux System Administration • Messaging • Pig • Messaging Systems • Backend Development • Team Management • Product Development • Dynamic Languages • Apache Spark • Apache Zookeeper • C++ • Spring Framework • Architectures • C • Avro • Integration • Architecture • Open Source Software
Languages
English
Interests
Cloudera • Data Mining and Analysis • Hdfs • Distributed Systems • Linux • Data Science • Hadoop Operability • Open Source • Hive (Computing) • Grid Computing • Music • Hbase • Software Design and Architecture • Statistics (Academic Discipline) • Flume • Computer Programming • Big Data • Apache Hadoop • Philosophy • Software Engineering • Distributed • Open Source Software • Database Systems • Mapreduce • Pig (Software)
- San Francisco CA, US
Eric Sammer - San Francisco CA, US
Kristal Curtis - San Francisco CA, US
Nghi Nguyen - Union City CA, US
International Classification:
G06N 5/04 G06F 17/30
Abstract:
As described herein, a portion of machine data of a message may be analyzed to infer, using an inference model, a sourcetype of the message. The portion of machine data may be generated by one or more components in an information technology environment. Based on the inference, a set of extraction rules associated with the sourcetype may be selected. Each extraction rule may define criteria for identifying a sub-portion of text from the portion of machine data of the message to produce a value. The set of extraction rules may be applied to the portion of machine data of the message to produce a result set that indicates a number of values identified using the set of extraction rules. Based on the result set, at least one action may be performed on one or more of inference data associated with the inference model and one or more messages.
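The flow in this abstract (infer a sourcetype, select that sourcetype's extraction rules, apply them, then act on how many values were found) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the rule table, the keyword heuristic standing in for the inference model, the `min_values` threshold, and the action names are hypothetical and are not taken from the patent.

```python
import re

# Hypothetical extraction rules keyed by sourcetype. Each rule names a field
# and gives a regex whose first group is the extracted value.
EXTRACTION_RULES = {
    "apache_access": {
        "client_ip": re.compile(r"^(\d{1,3}(?:\.\d{1,3}){3}) "),
        "status": re.compile(r'" (\d{3}) '),
    },
    "syslog": {
        "host": re.compile(r"^\w{3} +\d+ [\d:]{8} (\S+) "),
        "process": re.compile(r" (\w+)\[\d+\]:"),
    },
}


def infer_sourcetype(text):
    """Toy stand-in for the inference model: a simple keyword heuristic."""
    return "apache_access" if "HTTP/" in text else "syslog"


def apply_rules(text, sourcetype):
    """Apply every rule for the sourcetype and return the values found."""
    values = {}
    for field, pattern in EXTRACTION_RULES[sourcetype].items():
        match = pattern.search(text)
        if match:
            values[field] = match.group(1)
    return values


def check_inference(text, min_values=2):
    """Infer, extract, and choose an action from the number of values found."""
    sourcetype = infer_sourcetype(text)
    values = apply_rules(text, sourcetype)
    # Assumed policy: too few extracted values suggests a wrong inference.
    action = "accept" if len(values) >= min_values else "flag_for_review"
    return {"sourcetype": sourcetype, "values": values, "action": action}
```

Calling `check_inference` on a web access log line should accept the inferred sourcetype, while a message that matches none of the selected rules is flagged, which is one possible form of the "action" the abstract mentions.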
Dynamic Query Processor For Streaming And Batch Queries
- San Francisco CA, US
Joseph Gabriel Echeverria - San Francisco CA, US
Eric Sammer - San Francisco CA, US
International Classification:
G06F 17/30
Abstract:
Operational machine components of an information technology (IT) or other microprocessor- or microcontroller-permeated environment generate disparate forms of machine data. Network connections are established between these components and processors of a data intake and query system (DIQS). The DIQS conducts network transactions on a periodic and/or continuous basis with the machine components to receive disparate data and ingest certain of the data as entries of a data store that is searchable for DIQS query processing. The DIQS may receive queries to process against the received and ingested data via an exposed network interface. In one example embodiment, the DIQS receives a query identifying data to be processed, dynamically generates a query processing scheme based on the state of the data to be processed, such as streaming or at rest, and dynamically communicates the query processing scheme to a query executor based on the state of the data to be processed.
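The last sentence of this abstract describes choosing a query processing scheme from the state of the data and handing it to an executor. A minimal sketch of that idea follows. The names `DataState`, `Scheme`, `generate_scheme`, and `execute`, and the choice of "incremental" versus "batch" modes, are hypothetical and chosen only for illustration; the patent does not specify them.

```python
from dataclasses import dataclass
from enum import Enum


class DataState(Enum):
    STREAMING = "streaming"
    AT_REST = "at_rest"


@dataclass
class Scheme:
    state: DataState
    mode: str  # assumed modes: "incremental" or "batch"


def generate_scheme(state):
    """Pick a processing mode from the state of the data being queried."""
    mode = "incremental" if state is DataState.STREAMING else "batch"
    return Scheme(state, mode)


def execute(scheme, records, predicate):
    """Run a filter query the way the scheme prescribes."""
    if scheme.mode == "incremental":
        # Streaming data: yield matches lazily as records arrive.
        return (r for r in records if predicate(r))
    # Data at rest: scan everything and return the full result at once.
    return [r for r in records if predicate(r)]
```

With this sketch, the same filter predicate can be run against a live iterator or a stored list; only the scheme, derived from the data's state, decides whether results come back lazily or as a complete list.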
Cloudera - San Francisco Bay Area since Jan 2013
Engineering Manager
Cloudera - San Francisco Bay Area Mar 2010 - Jan 2013
Principal Solutions Architect
Domdex Oct 2009 - Mar 2010
Software Engineer
Conductor, Inc. Sep 2008 - Oct 2009
System Architect
stickK.com Feb 2008 - Sep 2008
Director of Technical Operations
YouTube
DC_THURS on Streaming Data Systems w/ Eric Sa...
Eric is an OG data engineer and one of most knowledgeable people on th...
Duration:
56m 15s
Keynote: The Evolution of Data Infrastructure...
Over the past few years almost all data processing has moved from batc...
Duration:
38m 3s
Building a practical real-time data platform ...
In this talk, we'll cover how you can build a real-time platform that ...
Duration:
25m 15s
Cloudera - Hadoop Application Development Mad...
ABOUT DATA COUNCIL: Data Council is a community and conference ser...
Duration:
40m
Trailer for Eric Sammer (Decodable) at #rtasu...
ABOUT STARTREE When you hear decision maker, it's natural to think, C-...
Duration:
27s
Decodable Is Making It Easier For Developers ...
Decodable is a relatively new, over-a-year-and-... focussed on making...
Duration:
20m 13s
Googleplus
Eric Sammer
Lived:
San Francisco, CA
Work:
Cloudera - Engineering Manager
About:
Solution Architect and Training Instructor @ Cloudera. Focused on open source data collection, processing, and reporting systems.