Dr. Alexander Engelhardt
engelhardt@alpha-epsilon.de
0176 5690 6728

The Alpha Epsilon Blog

All things data science

Welcome to my blog! Here I publish tutorials related to data science, which also serve as convenient cheatsheets and references for myself. I learn best when I organize and summarize material for presentation, so this blog doubles as a learning vehicle for me.

My projects

This list contains "mother" posts for my larger undertakings, each spanning multiple blog posts.

All posts

All posts ordered by newest

Date | Title | Category | Tags
17 October 2017 | What is Data Science? | Freelancing | DataScience
08 October 2017 | The differences when using Spark with Scala | CCA175 | Spark, Scala
07 October 2017 | Spark SQL with Python | CCA175 | Spark, SQL
13 September 2017 | Filter, aggregate, join, rank, and sort datasets (Spark/Python) | CCA175 | Python, Spark
07 September 2017 | Reading and writing data with Spark and Python | CCA175 | Spark, Python
29 August 2017 | A Data Science Case Study in R | R | DataScience, CaseStudy
08 August 2017 | A basic Spark/Python script | CCA175 | Spark
05 August 2017 | The LaTeX for WordPress plugin and PHP 7.0 / 7.1 | Misc | LaTeX, MathJaX, Wordpress
30 July 2017 | Scala introduction and cheatsheet | CCA175 | Scala, Spark, Cheatsheet
26 July 2017 | Disabling IPv6 on Arch Linux and NetworkManager | Linux | IPv6, VPN
25 July 2017 | Command-line options for spark-submit | CCA175 | Spark
23 July 2017 | My Python Cheatsheet | Python | Cheatsheet
22 July 2017 | How to design a Hadoop architecture | BigData | Architecture, Hadoop
21 July 2017 | Using Sqoop to move data between HDFS and MySQL | CCA175 | MySQL, SQL, Sqoop
21 July 2017 | Spark Streaming | BigData | Hadoop, Spark, Streaming
21 July 2017 | Load data into and out of HDFS using the Hadoop File System commands | CCA175 | Hadoop
19 July 2017 | First Python steps for R users | Python |
18 July 2017 | Sharing Python notebooks on Jekyll | Python | GitHub, Notebook
18 July 2017 | Getting streaming data with Kafka and Flume | BigData | Flume, Hadoop, Kafka, Streaming
16 July 2017 | Preparing for the Cloudera Exam CCA175: Spark and Hadoop Developer | CCA175 | Hadoop, Spark, Cloudera
13 July 2017 | Apache Drill | BigData | Drill, Hive, MongoDB, Phoenix, Presto
11 July 2017 | MongoDB | BigData | MongoDB, NoSQL
09 July 2017 | NoSQL: non-relational databases | BigData | Hadoop, NoSQL, SQL
09 July 2017 | HBase | BigData | Hadoop, NoSQL
09 July 2017 | Cassandra | BigData | NoSQL
08 July 2017 | Hive | BigData | Hadoop
05 July 2017 | An overview of Amazon's AWS services | BigData | AWS, Cloudera, Hadoop
04 July 2017 | Spark | BigData | Hadoop, Spark
01 July 2017 | Pig: An introduction | BigData | Hadoop
30 June 2017 | The Hadoop core: HDFS and MapReduce | BigData | Hadoop, MapReduce
29 June 2017 | The Hadoop ecosystem: An overview | BigData | Hadoop
28 June 2017 | Connect R with Access2007 via RODBC | R | ODBC, Access
17 June 2017 | Dear Recruiters: | Freelancing |
13 June 2017 | Sharing confidential data with nginx and htaccess | Linux | VPS
11 June 2017 | Administrating your own git server | Linux |
12 April 2017 | lFTP usage | Linux | FTP, Linux
12 April 2017 | Getting started with git | Programming |
12 April 2017 | SSH and scp | Linux |
16 December 2016 | diff tips and tricks | Linux | Diff, Linux
14 December 2016 | grep - Tips and Tricks | Linux |
14 August 2015 | Cluster computing on the Sun Grid Engine | Programming | Cluster computing, Sun Grid Engine
02 May 2014 | Awk tips and tricks and Bioinformatics applications | Programming | awk
08 January 2014 | Data analysis Hadley Wickham style | R |
12 October 2013 | Arch Linux on a MacBook Pro 9.2 | Linux |

All posts by category

Posts in Linux
Date | Title | Tags
26 July 2017 | Disabling IPv6 on Arch Linux and NetworkManager | IPv6, VPN
13 June 2017 | Sharing confidential data with nginx and htaccess | VPS
11 June 2017 | Administrating your own git server |
12 April 2017 | lFTP usage | FTP, Linux
12 April 2017 | SSH and scp |
16 December 2016 | diff tips and tricks | Diff, Linux
14 December 2016 | grep - Tips and Tricks |
12 October 2013 | Arch Linux on a MacBook Pro 9.2 |
Posts in R
Date | Title | Tags
29 August 2017 | A Data Science Case Study in R | DataScience, CaseStudy
28 June 2017 | Connect R with Access2007 via RODBC | ODBC, Access
08 January 2014 | Data analysis Hadley Wickham style |
Posts in Programming
Date | Title | Tags
12 April 2017 | Getting started with git |
14 August 2015 | Cluster computing on the Sun Grid Engine | Cluster computing, Sun Grid Engine
02 May 2014 | Awk tips and tricks and Bioinformatics applications | awk
Posts in Freelancing
Date | Title | Tags
17 October 2017 | What is Data Science? | DataScience
17 June 2017 | Dear Recruiters: |
Posts in BigData
Date | Title | Tags
22 July 2017 | How to design a Hadoop architecture | Architecture, Hadoop
21 July 2017 | Spark Streaming | Hadoop, Spark, Streaming
18 July 2017 | Getting streaming data with Kafka and Flume | Flume, Hadoop, Kafka, Streaming
13 July 2017 | Apache Drill | Drill, Hive, MongoDB, Phoenix, Presto
11 July 2017 | MongoDB | MongoDB, NoSQL
09 July 2017 | NoSQL: non-relational databases | Hadoop, NoSQL, SQL
09 July 2017 | HBase | Hadoop, NoSQL
09 July 2017 | Cassandra | NoSQL
08 July 2017 | Hive | Hadoop
05 July 2017 | An overview of Amazon's AWS services | AWS, Cloudera, Hadoop
04 July 2017 | Spark | Hadoop, Spark
01 July 2017 | Pig: An introduction | Hadoop
30 June 2017 | The Hadoop core: HDFS and MapReduce | Hadoop, MapReduce
29 June 2017 | The Hadoop ecosystem: An overview | Hadoop
Posts in CCA175
Date | Title | Tags
08 October 2017 | The differences when using Spark with Scala | Spark, Scala
07 October 2017 | Spark SQL with Python | Spark, SQL
13 September 2017 | Filter, aggregate, join, rank, and sort datasets (Spark/Python) | Python, Spark
07 September 2017 | Reading and writing data with Spark and Python | Spark, Python
08 August 2017 | A basic Spark/Python script | Spark
30 July 2017 | Scala introduction and cheatsheet | Scala, Spark, Cheatsheet
25 July 2017 | Command-line options for spark-submit | Spark
21 July 2017 | Using Sqoop to move data between HDFS and MySQL | MySQL, SQL, Sqoop
21 July 2017 | Load data into and out of HDFS using the Hadoop File System commands | Hadoop
16 July 2017 | Preparing for the Cloudera Exam CCA175: Spark and Hadoop Developer | Hadoop, Spark, Cloudera
Posts in Python
Date | Title | Tags
23 July 2017 | My Python Cheatsheet | Cheatsheet
19 July 2017 | First Python steps for R users |
18 July 2017 | Sharing Python notebooks on Jekyll | GitHub, Notebook
Posts in Misc
Date | Title | Tags
05 August 2017 | The LaTeX for WordPress plugin and PHP 7.0 / 7.1 | LaTeX, MathJaX, Wordpress