Test-Driven Development (TDD)

"Test-driven development (TDD) is a software development process that relies on the repetition of a very short development cycle: requirements are turned into very specific test cases, then the software is improved to pass the new tests, only. This is opposed to software development that allows software to be added that is not proven to

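One such short cycle could look like the following minimal Python sketch. The `slugify` function and its requirement are hypothetical and serve only as an illustration: the requirement is first written down as a specific test case, and only then is just enough code added to make that test pass.

```python
import unittest


# Step 1 (red): the hypothetical requirement "titles become lowercase,
# hyphen-separated slugs" is captured as a specific test case.
# Run before slugify exists, this test fails.
class TestSlugify(unittest.TestCase):
    def test_spaces_become_hyphens(self):
        self.assertEqual(
            slugify("Test Driven Development"),
            "test-driven-development",
        )


# Step 2 (green): add only as much code as the test demands.
def slugify(title):
    # Lowercase the title and join its whitespace-separated words with hyphens.
    return "-".join(title.lower().split())
```

Saved as a module, the test can be run with `python -m unittest <module name>`. The next requirement would start the cycle again with a new failing test.
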
RStudio Addin

If you want to create your own RStudio addins, all you need to do is: create an R package, create some R functions, and create a file at inst/rstudio/addins.dcf. Links 1. Create an R Package Set up tools for package development library(devtools) library(roxygen2) # getwd() # setwd("path/to/repo") Create Package I am mainly following: https://hilaryparker.com/2014/04/29/writing-an-r-package-from-scratch/ create("rstudio_addin") This

Python Cheatsheet

The following gives an overview of the most basic concepts found in Python. It serves as a quick reminder when you have not coded in Python for a while. Source: Download text file or Fork me on GitHub Main if __name__ == '__main__': main() List <list> = <list>[from_inclusive : to_exclusive : step_size] <list>.append(<el>) <list>.extend(<collection>) <list> +=
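
As a small illustration of the list operations named above, the sketch below uses made-up values. The variable names are chosen only for this example.

```python
numbers = [0, 1, 2, 3, 4, 5]

# Slice: start at index 0 (inclusive), stop at index 6 (exclusive), step 2.
every_second = numbers[0:6:2]

# append adds one element to the end of the list.
numbers.append(6)

# extend adds every element of another collection to the end of the list.
numbers.extend([7, 8])

print(every_second)  # [0, 2, 4]
print(numbers)       # [0, 1, 2, 3, 4, 5, 6, 7, 8]
```

Note that the slice returns a new list, so the later `append` and `extend` calls do not change `every_second`.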

Airbnb Knowledge Repo

The Knowledge Repo project is focused on facilitating the sharing of knowledge between data scientists and other technical roles using data formats and tools that make sense in these professions. It provides various data stores (and utilities to manage them) for "knowledge posts", with a particular focus on notebooks (R Markdown and Jupyter / IPython

Power BI

Using Power BI and R. Tutorial here: Run R scripts in Power BI Desktop. The only twist that I want to add is an idea on how to enable users without admin access to run R code. This can be achieved by storing a portable R installation on a mountable file storage. R Download the

Python Virtual Environment

Conda How to set up virtual environments using conda for the Anaconda Python distribution. A virtual environment is a named, isolated, working copy of Python that maintains its own files, directories, and paths so that you can work with specific versions of libraries or Python itself without affecting other Python projects. Virtual environments

SQL Server

Columnstore A columnstore index can provide a very high level of data compression, typically by 10 times, to significantly reduce your data warehouse storage cost. For analytics, a columnstore index offers an order of magnitude better performance than a btree index. Columnstore indexes are the preferred data storage format for data warehousing and analytics workloads.

Microsoft Big Data Overview

https://academy.microsoft.com/en-us/professional-program/tracks/big-data/ Block 1 – Data Fundamentals Learn data science basics. Explore topics like data queries, data analysis, data visualization and how statistics informs data science practices. Please choose from Course 2a or Course 2b to complete the unit. Course 1: Microsoft Professional Program: Introduction to Big Data Course 2a: Analyzing and Visualizing Data with Power

Scrum PSM I

After getting the PSM I certification from scrum.org, I wanted to capture the relevant content. The complete guide can be downloaded here: scrumguides.org 1. What is Scrum? Scrum is a framework for developing and sustaining complex products. A framework in which complex adaptive problems can be addressed. It is lightweight, simple to understand and yet difficult to

Coding the Matrix

How to test the solutions python3 submit.py python_lab.py Lab 1: Introduction to Python—sets, lists, dictionaries, and comprehensions Python provides some simple data structures for grouping together multiple values, and integrates them with the rest of the language. These data structures are called collections. Sets A set is an unordered collection in which each value occurs
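
A short sketch of the set behaviour described above, with values made up for illustration: duplicates collapse to a single occurrence, and a comprehension builds a new set from an existing one.

```python
# A set literal with a repeated value: the duplicate 2 is stored only once.
values = {1, 2, 2, 3}
print(len(values))  # 3

# A set comprehension: build a new set from each element of an existing one.
squares = {x * x for x in values}
print(squares == {1, 4, 9})  # True

# Membership tests do not depend on any ordering of the elements.
print(2 in values)  # True
```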

Conway’s Law

Many organizations have become adept at identifying what they need from software development projects, based on a keen understanding of their business goals. Even so, they're often surprised to find out that the end results don't achieve the transformative impact they were expecting. Their mistake? Overlooking the importance of Conway's Law. In 1967, Melvin Conway

Famous Laws of Software Development

Murphy's Law Probably one of the most famous of all laws, mostly because it is not only applicable to Software Development. If something can go wrong, it will. First derivation: If it works, you probably didn't write it. Second derivation: Cursing is the only language all programmers speak fluently. Conclusion: A computer will do what

Machine Learning Overview

Broadly, there are three types of Machine Learning algorithms. 1. Supervised Learning How it works: This algorithm consists of a target or outcome variable (or dependent variable) which is to be predicted from a given set of predictors (independent variables). Using this set of variables, we generate a function that maps inputs to desired outputs.
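
To make the idea of "generating a function that maps inputs to desired outputs" concrete, here is a minimal sketch. It assumes a single predictor and uses ordinary least squares as one simple example of a supervised method. The helper `fit_line` and the data are hypothetical and not taken from the original post.

```python
def fit_line(predictors, targets):
    """Fit y = slope * x + intercept by ordinary least squares.

    Returns a function that maps a new input to a predicted output.
    """
    n = len(predictors)
    mean_x = sum(predictors) / n
    mean_y = sum(targets) / n
    # Spread of the predictor, and how predictor and target vary together.
    sxx = sum((x - mean_x) ** 2 for x in predictors)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predictors, targets))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


# Made-up training data: the target follows y = 2x + 1 exactly.
predictors = [1.0, 2.0, 3.0, 4.0]
targets = [3.0, 5.0, 7.0, 9.0]

predict = fit_line(predictors, targets)
print(predict(5.0))  # 11.0
```

The training step sees both the predictors and the known targets. The returned `predict` function is then applied to inputs whose outcome is not yet known.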

Applied Predictive Modelling

Source cran Chapter 1 Introduction Prediction Versus Interpretation, Key Ingredients of Predictive Models; Terminology; Example Data Sets and Typical Data Scenarios; Overview; Notation (15 pages, 3 figures) Part I: General Strategies Chapter 2 A Short Tour of the Predictive Modeling Process Case Study: Predicting Fuel Economy; Themes; Summary (8 pages, 6 figures, R packages used)

EARL 2018, London

Conference Day 1 Wednesday 12 September Keynote: Edwina Dunn, Starcount; Garrett Grolemund, RStudio. Session 1: 1. "A Validated R Environment in the Cloud for Life Science R&D" Jobst Loffler, Bayer Business Services GmbH Waiting on Rstudio Item 2 2. "A brief history of Data at Autotrader; how R has got us here" Paul