Test-Driven Development (TDD)

“Test-driven development (TDD) is a software development process that relies on the repetition of a very short development cycle: requirements are turned into very specific test cases, then the software is improved to pass the new tests, only. This is opposed to software development that allows software to be added that is not proven to … Read more
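As a minimal sketch of the cycle described above: the test is written first and states one very specific requirement, and the function then contains only what is needed to pass it. The `slugify` function and its behaviour are hypothetical and chosen purely for illustration.

```python
# Step 1: the requirement, written down as a very specific test case.
# (slugify is a hypothetical function, not something from the original post.)
def test_spaces_become_hyphens():
    assert slugify("Test Driven Development") == "test-driven-development"


# Step 2: just enough implementation to make the test pass, and nothing more.
def slugify(title: str) -> str:
    """Lower-case a title and join its words with hyphens."""
    return "-".join(title.lower().split())


# Step 3: run the test; from here the cycle repeats with the next requirement.
test_spaces_become_hyphens()
```

In a real project the test would normally live in its own file and be run by a test runner such as pytest or unittest.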

REST Calls with Postman

How to set up Postman In order to test REST calls, one tool has emerged over the last few years: Postman. The following 2-minute video is a great summary of how to set it up quickly: [embedded content] I do not want to get into the details. If you need them, you can find them here: … Read more

SQL Server Advanced

Sometimes you have a series of stored procedures that are themselves managed by another master stored procedure like so. This usually just means a series of EXEC statements one after another. It is quite handy to create a logging event after each stored procedure call in order to check its progress. Logging with try-catch CREATE … Read more

Azure Functions

This blog post will demonstrate how to set up Azure Functions with some Python code. More precisely, it will show how to call an Azure Function, add a parameter that specifies the name of the file that we want to read from, and store that information in a database. Between reading and storing we have the … Read more

RStudio Addin

If you want to create your own RStudio addins, all you need to do is: Create an R package Create some R functions Create a file at inst/rstudio/addins.dcf Links 1. Create an R Package Set up tools for package development library(devtools) library(roxygen2) # getwd() # setwd("path/to/repo") Create Package I am mainly following: https://hilaryparker.com/2014/04/29/writing-an-r-package-from-scratch/ create("rstudio_addin") This … Read more

R Travis

In this post we will explore how to set up R package development on GitHub, focusing on implementing automatic Travis and code coverage checks. I set up a sample repo that will include a very basic configuration: TravisR Travis is a great You can easily sign up by connecting your GitHub account: You will need … Read more

Python Cheatsheet

The following gives an overview of the most basic concepts found in Python. It serves as a quick reminder when you have not coded in Python for a while. Source: Download text file or Fork me on GitHub Main if __name__ == '__main__': main() List <list> = <list>[from_inclusive : to_exclusive : step_size] <list>.append(<el>) <list>.extend(<collection>) <list> += … Read more
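The list operations listed in this excerpt can be tried out in a few lines. This is a small sketch with made-up values; the variable names are illustrative only.

```python
numbers = [0, 1, 2, 3, 4, 5]

# <list>[from_inclusive : to_exclusive : step_size]
every_second = numbers[0:6:2]   # start at index 0, stop before index 6, step 2
middle = numbers[1:4]           # step_size defaults to 1

# <list>.append(<el>) adds a single element to the end
numbers.append(6)

# <list>.extend(<collection>) adds every element of another collection
numbers.extend([7, 8])

print(every_second)
print(middle)
print(numbers)
```

Slicing returns a new list, so `every_second` and `middle` are not affected by the later `append` and `extend` calls.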

Airbnb Knowledge Repo

The Knowledge Repo project is focused on facilitating the sharing of knowledge between data scientists and other technical roles using data formats and tools that make sense in these professions. It provides various data stores (and utilities to manage them) for “knowledge posts”, with a particular focus on notebooks (R Markdown and Jupyter / IPython … Read more

Power BI

Using Power BI and R Tutorial here: Run R scripts in Power BI Desktop The only twist that I want to add is an idea on how to enable users without admin access to run R code. This can be achieved by storing a portable R installation on a mountable file storage. R Download the … Read more

Python Virtual Environment

Conda How to set up virtual environments using conda for the Anaconda Python distribution A virtual environment is a named, isolated, working copy of Python that maintains its own files, directories, and paths so that you can work with specific versions of libraries or Python itself without affecting other Python projects. Virtual environments … Read more

SQL Server

Columnstore A columnstore index can provide a very high level of data compression, typically by 10 times, to significantly reduce your data warehouse storage cost. For analytics, a columnstore index offers an order of magnitude better performance than a btree index. Columnstore indexes are the preferred data storage format for data warehousing and analytics workloads. … Read more

Microsoft Big Data Overview

https://academy.microsoft.com/en-us/professional-program/tracks/big-data/ Block 1 – Data Fundamentals Learn data science basics. Explore topics like data queries, data analysis, data visualization and how statistics informs data science practices. Please choose from Course 2a or Course 2b to complete the unit. Course 1: Microsoft Professional Program: Introduction to Big Data Course 2a: Analyzing and Visualizing Data with Power … Read more

Data Science Austria

Over the last few months I set out to build a news and event aggregator. You can see the work in progress here: data-science-austria.at WordPress Plugins Here is a list of plugins that I use for the site, grouped by their general purpose. The first one is a collection that I would … Read more

Docker Basics

Docker is a tool which helps developers build and ship high quality applications, faster, anywhere. Source Why Docker With Docker, developers can build any app in any language using any toolchain. Dockerized apps are completely portable and can run anywhere. Developers can get going by just spinning up any container from the list on Docker Hub. … Read more

Scrum PSM I

After getting the PSM I from scrum.org, I wanted to capture the relevant content. The complete guide can be downloaded here: scrumguides.org 1. What is Scrum? Scrum is a framework for developing and sustaining complex products. A framework in which complex adaptive problems can be addressed. It is lightweight, simple to understand and yet difficult to … Read more

Blogging with hugo & netlify

Here is a great tutorial on how to host hugo on netlify Other examples using the exact same theme: Creating the hugo site In order to create a new hugo site simply run: hugo new site [path] [flags] Create a new repository via git init the git repo and push it to the GitHub repo: … Read more

Office Ribbons

I am an absolute fan of adapting your work environment to your needs. Spending an hour to set up some shortcuts is virtually always a good time investment. Then you can easily drag your most used commands into a new bar. You should be able to save a lot of time on, e.g. aligning objects … Read more

Coding the Matrix

How to test the solutions python3 submit.py python_lab.py Lab 1: Introduction to Python—sets, lists, dictionaries, and comprehensions Python provides some simple data structures for grouping together multiple values, and integrates them with the rest of the language. These data structures are called collections. Sets A set is an unordered collection in which each value occurs … Read more
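A short sketch of the set behaviour described in this excerpt, using made-up values: duplicates written in a set literal collapse to a single element, and a comprehension builds a new set from an existing one.

```python
# The duplicate 2 is stored only once.
values = {1, 2, 2, 3}

# A set comprehension: the square of every element of `values`.
squares = {x * x for x in values}

# Membership tests are the typical use of a set.
has_two = 2 in values
has_five = 5 in values

print(len(values), squares, has_two, has_five)
```

Because a set is unordered, the printed order of `squares` is not guaranteed; compare sets with `==` rather than relying on their printed form.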

Most useful git commands

This is an overview of the most common git commands. I strongly recommend knowing its basics and using it to anybody writing code, whether alone or particularly within a team. To get a quick theoretical introduction to the topic, please have a look here: Getting Started – Git Basics Setup Set your details git config … Read more

Conway’s Law

Many organizations have become adept at identifying what they need from software development projects, based on a keen understanding of their business goals. Even so, they’re often surprised to find out that the end results don’t achieve the transformative impact they were expecting. Their mistake? Overlooking the importance of Conway’s Law. In 1967, Melvin Conway … Read more

Famous Laws of Software Development

Murphy’s Law Probably one of the most famous of all laws, mostly because it is not only applicable to Software Development. If something can go wrong, it will. First derivation: If it works, you probably didn’t write it. Second derivation: Cursing is the only language all programmers speak fluently. Conclusion: A computer will do what … Read more

Logistic Regression

Don’t get confused by its name! It is a classification, not a regression, algorithm. It is used to estimate discrete values (binary values like 0/1, yes/no, true/false) based on a given set of independent variable(s). In simple words, it predicts the probability of occurrence of an event by fitting data to a logit function. Hence, … Read more
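To make the idea concrete, here is a minimal sketch of how a logistic model turns one independent variable into a probability and then into a binary class. The weight and bias are hypothetical hand-picked numbers, not fitted to any data, and the 0.5 threshold is an assumption.

```python
import math


def predict_probability(x: float, weight: float, bias: float) -> float:
    """Probability of the event, via the logistic (inverse logit) function."""
    log_odds = weight * x + bias          # linear part of the model
    return 1.0 / (1.0 + math.exp(-log_odds))


def classify(x: float, weight: float, bias: float, threshold: float = 0.5) -> int:
    """Turn the probability into a discrete 0/1 prediction."""
    return 1 if predict_probability(x, weight, bias) >= threshold else 0


# Hypothetical parameters, chosen only for illustration.
weight, bias = 2.0, -1.0
for x in (-3.0, 0.5, 3.0):
    print(x, round(predict_probability(x, weight, bias), 3), classify(x, weight, bias))
```

In practice the weight and bias would be estimated from labelled data, for example by maximum likelihood, rather than set by hand.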

Machine Learning Overview

Broadly, there are three types of Machine Learning Algorithms. 1. Supervised Learning How it works: This algorithm consists of a target or outcome variable (or dependent variable) which is to be predicted from a given set of predictors (independent variables). Using this set of variables, we generate a function that maps inputs to desired outputs. … Read more
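As one small illustration of "generating a function that maps inputs to desired outputs", the sketch below fits a straight line to a toy data set with ordinary least squares. The data and the choice of a linear model are assumptions made for this example, not something taken from the post.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy training data: predictor values and the target to be predicted.
predictors = [1.0, 2.0, 3.0, 4.0]
targets = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(predictors, targets)


def predict(x):
    """The learned function that maps an input to a predicted output."""
    return slope * x + intercept


print(slope, intercept, predict(5.0))
```

The same pattern (learn parameters from labelled examples, then apply the resulting function to new inputs) carries over to other supervised algorithms.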

Applied Predictive Modelling

Source cran Chapter 1 Introduction Prediction Versus Interpretation, Key Ingredients of Predictive Models; Terminology; Example Data Sets and Typical Data Scenarios; Overview; Notation (15 pages, 3 figures) Part I: General Strategies Chapter 2 A Short Tour of the Predictive Modeling Process Case Study: Predicting Fuel Economy; Themes; Summary (8 pages, 6 figures, R packages used) … Read more

Reading List

Blogs Data Science Austria: data-science-austria.at Personal blog: harlecin.netlify.com Hacker News: news.ycombinator.com R-Bloggers: r-bloggers.com Reddit: reddit.com/r/MachineLearning Favorite

Jeff Bezos’ API Mandate

The best article I’ve ever read about architecture and the management of IT. I was at Amazon for about six and a half years, and now I’ve been at Google for that long. One thing that struck me immediately about the two companies – an impression that has been reinforced almost daily – is that … Read more

Docker Python

Testing the base image docker run python:3 /bin/echo 'Hello world' docker run is a command to run a container. python:3 is the image you run; another example would be the Ubuntu operating system image. When you specify an image, Docker looks first for the image on your Docker host. If the image does not exist locally, then … Read more

EARL 2018, London

Conference Day 1 Wednesday 12 September Keynote Edwina Dunn, Starcount Garrett Grolemund, RStudio Session 1 1. “A Validated R Environment in the Cloud for Life Science R&D” Jobst Loffler, Bayer Business Services GmbH Waiting on RStudio Item 2 2. “A brief history of Data at Autotrader; how R has got us here” Paul … Read more