What is Seaborn and why should you use it for data visualization?

Erik Kappelman
30 Jan 2018
6 min read
Seaborn is a Python library created for enhanced data visualization. It's a very timely and relevant tool for data professionals working today, precisely because effective data visualization – and communication in general – is an essential skill. Being able to bridge the gap between data and insight is hugely valuable, and Seaborn fits comfortably in the toolchain of anyone interested in doing just that. There is, of course, a huge range of data visualization libraries out there – but if you're wondering why you should use Seaborn, put simply, it brings some serious power to the table that other tools can't quite match. Follow this Seaborn tutorial and you'll find out what makes Seaborn such a good data visualization library.

How to get started with Seaborn

To get started, I recommend becoming familiar with Anaconda, if you are not already. I find that using Anaconda and its various tools makes coding in Python, especially package and library management, a whole lot easier. So, let's load the packages we are going to need. (I am assuming you have already downloaded and set up Seaborn.)

```python
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
```

Now that we have our packages on board, let's just make a basic plot. The function below creates a series of sine functions and then graphs all of them; take a look:

```python
np.random.seed(sum(map(ord, "aesthetics")))

def sinplot(flip=1):
    x = np.linspace(0, 14, 100)
    for i in range(1, 7):
        plt.plot(x, np.sin(x + i * .5) * (7 - i) * flip)

sinplot()
plt.savefig("sin.png")
```

It's a pretty basic set of sine curves, and while it looks professional and clean, it doesn't really tell us much about what makes Seaborn unique.

So what makes Seaborn different? What are the benefits of Seaborn?

Well, let's take a look at what Seaborn refers to as 'joint plots.' These plots pair a scatter plot with the distribution of each variable in the scatter plot on the axes. Let's look at the code for the next two graphs and then we'll discuss why they matter:

```python
join1 = sns.jointplot(x="x", y="y", data=df)
join1.savefig("join1.png")

join2 = sns.jointplot(x="x", y="y", data=df, kind="kde")
join2.savefig("join2.png")
plt.clf()
```

The first plot isn't unique to Seaborn – I've created very similar plots in R. The difference is that this one took a single line of code, whereas in R you're looking at five or six lines at the very least, and you're going to have to use the default plotting package, because I've never been able to figure out marginal plots in ggplot2. Graphs like this really show us a lot about the data we are examining. We can simultaneously see that the two sets of data are correlated and that they are both somewhat skewed and non-normal, although the y variable could probably pass as normal. If marginal plots were this easy in R, I would use them a whole lot more, because they are so informative.

The next plot, however, is different. In fact, I hadn't really seen anything like it before I learned about Seaborn. This plot uses a kernel density plot instead of a scatter plot, and the distributions are estimated smoothly instead of using histograms. This could be a helpful graph if you were specifically interested in densities and correlations as well as the distributions of the data, which can be quite beneficial in various spatial analysis applications as well as more traditional statistical fields.
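One thing to flag about the joint-plot snippets above: they reference a DataFrame df that this excerpt never constructs. As a minimal, hypothetical setup (the column names "x" and "y" are the only real requirement), you could draw correlated samples from a bivariate normal distribution:

```python
import numpy as np
import pandas as pd

# Hypothetical setup: the article doesn't show how df was built, so this is
# just one plausible construction -- 500 correlated samples in columns "x" and "y".
rng = np.random.default_rng(42)
mean, cov = [0, 1], [[1.0, 0.6], [0.6, 1.0]]
x, y = rng.multivariate_normal(mean, cov, size=500).T
df = pd.DataFrame({"x": x, "y": y})
```

With a frame like this in place, the sns.jointplot calls above run as written.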
The third joint plot includes a regression line in the scatter plot, as well as an assessment of the fit of the linear model used. The code used to produce this plot is below:

```python
tips = sns.load_dataset("tips")
sns.jointplot(x="total_bill", y="tip", data=tips, kind="reg")
plt.savefig("join3.png")
```

The inclusion of error fields around the line helps you to better visualize the accuracy of the linear regression, and the distribution of the data is available in the margins. Normally, it would take three separate graphs to convey all of this information. Seaborn makes this much simpler: with a single line of code, we are able to create a graph that covers all of the relevant information related to this linear regression.

Another somewhat novel graph type available in Seaborn is the violin plot. Again, we can create this complex graph with the simple code shown below:

```python
iris = sns.load_dataset("iris")
sns.violinplot(x=iris.species, y=iris.sepal_length, data=iris)
plt.savefig("violin.png")
```

This is data from the famous Iris dataset. The violin plot is essentially an amalgamation of a box plot and a kernel density estimate of a distribution. Both box plots and graphs of univariate distributions are very helpful when first beginning the analysis of a dataset. Again, Seaborn takes a lot of the work out of this process by making it easy to produce single graphs that would normally take multiple graphs in other analysis tools.

The final chart I would like to show is really useful: it summarizes the results of a univariate logistic regression graphically. This is a tough thing to display, and until I came across Seaborn I had never seen an example I would consider good. The chart is created with the code below:

```python
tips["big_tip"] = tips["tip"] / tips["total_bill"] >= 0.2
sns.lmplot(x="total_bill", y="big_tip", data=tips, logistic=True, y_jitter=.03)
plt.savefig("tiplogit.png")
```

The chart displays the results of regressing a binary indicator of whether a tip was larger than 20 percent ('big') against the total cost of the meal. It illustrates very clearly that people are not tipping as much, at least proportionally, when their meals are more expensive. Summarizing the results of logistic regressions is always challenging, but as you can see, thanks to Seaborn you can do a pretty good job with just one line of code.

Seaborn is simply a really great library that's worth your time exploring – I hope this post has convinced you and inspired you to go and try it for yourself if you haven't already. There is always room for improvement when it comes to data visualization. Seaborn might be the improvement you need. I know I'll be using it.

15 Useful Python Libraries to make your Data Science tasks Easier

Amey Varangaonkar
12 Feb 2018
10 min read
Python has become a big hit in the Data Science community over the last five years – so much so that it is slowly taking over R, the 'lingua franca of statistics', as the preferred tool for many. The recently published Stack Overflow Developer Survey 2018 suggests Python is the next big programming language, and its adoption in the industry is only going to increase. Python's rise has been staggering, but not really surprising. Its general-purpose nature, coupled with its efficiency and ease of use, makes it easier for you to build your data science solutions without any hassle. You also have a rich suite of Python libraries at your disposal for all your Data Science-related tasks – from basic web scraping to something as complex as training deep learning models. In this article, we take a look at some of the most popular and widely used Python libraries and their application areas.

Web Scraping

Web scraping is a popular technique for extracting information from the web using the HTTP protocol, with the help of a web browser. The two most commonly used tools for web scraping are, unsurprisingly, Python-based.

1. Beautiful Soup

Beautiful Soup is a popular Python library for extracting information out of HTML and XML files. It provides a unique, easy way to navigate, search and modify the parsed data, potentially saving you hours of needless work. It works with both versions of Python, i.e. 2.7 and 3.x, and is very easy to use. Check out our latest tutorial on how to scrape a web page using Beautiful Soup.

Editor's Tip: If you're new to the concept of web scraping, Beautiful Soup should be your go-to library. You can learn more about how to use this library more efficiently in our book Python Web Scraping Cookbook.

2. Scrapy

Scrapy is a free, open source framework written in Python. Although developed for web scraping, it can also be used as a general web crawler and to extract data using different APIs. Following the 'Don't Repeat Yourself' philosophy of frameworks such as Django, Scrapy includes a set of self-contained crawlers, each of them following specific instructions with a specific objective.

Editor's Tip: To learn how to use Scrapy for your scraping projects, our book Python Web Scraping, Second Edition is definitely worth checking out.
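To give a flavour of how little code a basic scraping task needs, here is a minimal Beautiful Soup sketch. The URL below is a placeholder rather than a real endpoint, and the snippet assumes the requests library is installed alongside beautifulsoup4:

```python
import requests
from bs4 import BeautifulSoup

# Placeholder URL -- substitute the page you actually want to scrape.
response = requests.get("https://example.com/articles")
soup = BeautifulSoup(response.text, "html.parser")

# Collect the text of every link on the page.
links = [a.get_text(strip=True) for a in soup.find_all("a")]
print(links[:10])
```

Scrapy projects are organised differently (spiders, pipelines and a command-line runner), but the parsing step inside a spider looks much the same.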
Scientific Computation and Data Analysis

These are arguably the most common data science tasks, and Python proves to be of great worth to data scientists by providing unique libraries for data manipulation and analysis, as well as mathematical computation.

3. NumPy

NumPy is the most popular library for scientific computing in Python and is part of the larger Python stack for scientific computation called SciPy (discussed below). Apart from its uses in linear algebra and other mathematical functions, it can also be used as a multi-dimensional container, or array, of generic data with arbitrary data types. NumPy integrates seamlessly with languages such as C/C++, and because of its support for multiple data types, it works well with a variety of databases as well.

4. SciPy

SciPy is a Python-based framework containing open source libraries for mathematics, scientific computation and data analysis. The SciPy library is a collection of algorithms and tools for advanced mathematical computations, statistics and much more. The SciPy stack consists of the following libraries:

  • NumPy - Python package for numerical computation
  • SciPy - one of the core packages of the SciPy stack, for signal processing, optimization and advanced statistics
  • matplotlib - popular Python library for data visualization
  • SymPy - library for symbolic mathematics and algebra
  • pandas - Python library for data manipulation and analysis
  • IPython - interactive console to run Python-based code

5. pandas

pandas is a widely used Python package providing data structures and tools for effective data manipulation and analysis. It is a popular tool for quantitative analysis and finds a lot of application in algorithmic trading and risk analysis. With a large community of dedicated users, pandas is regularly updated with new API changes, performance updates and bug fixes. This is one library you definitely need to work with to truly realize its power.

Editor's Tip: To get a more hands-on understanding of how to effectively use pandas for data analysis, make sure you check out our highly popular title pandas Cookbook.

Machine Learning and Deep Learning

Python trumps all other languages when it comes to implementing efficient machine learning and deep learning models, simply by virtue of its diverse, effective and easy-to-use set of libraries. It is worth having a look at the experts' take on why Python is great for machine learning and Artificial Intelligence. In this section, we look at some of the most popular and commonly used Python libraries for machine learning and deep learning.

6. scikit-learn

scikit-learn is the most popular Python library for data mining, analysis and machine learning. It is built on the capabilities of NumPy, SciPy and matplotlib, and is commercially usable. You can implement a variety of machine learning techniques such as classification, regression, clustering and more using scikit-learn. It is very easy to install and has clean, slick documentation for anyone looking to get started with it.

Editor's Tip: To understand how to use scikit-learn in your machine learning projects, our bestselling book Python Machine Learning, Second Edition is all you need. If you're looking to specifically master scikit-learn, Mastering Machine Learning with scikit-learn will prove to be a very useful resource. Check it out!
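As a quick, hedged illustration of how compact a scikit-learn workflow can be, here is a minimal classification sketch on the bundled Iris dataset (the exact accuracy you get will depend on the random split):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load a small built-in dataset and split it into train and test sets.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Fit a simple classifier and score it on the held-out data.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```

The same fit/predict/score pattern carries over to almost every estimator in the library, which is a big part of why scikit-learn is so approachable.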
7. TensorFlow

TensorFlow is the popular machine learning library everyone seems to be talking about today. It is a Python-based framework for effective machine learning and deep learning using multiple CPUs or GPUs. Backed by Google, it was initially developed by the Google Brain research team, and it is among the most widely used frameworks in the world for machine intelligence. It enjoys the support of a large community of active users and is finding widespread application for advanced machine learning across a multitude of industrial domains – from manufacturing and retail to healthcare and smart cars. If you are interested to know more about TensorFlow, you can quickly check out the tutorial here.

Editor's Tip: TensorFlow being the most popular framework for machine learning and deep learning, it is one library you should definitely master. Check out the following books to skill up quickly!

  • Machine Learning with TensorFlow 1.x
  • TensorFlow Machine Learning Cookbook
  • Deep Learning with TensorFlow
  • TensorFlow 1.x Deep Learning Cookbook
  • Mastering TensorFlow 1.x

8. Keras

Keras is a Python-based neural networks API that offers a simplified interface to train and deploy your deep learning models with ease. It has support for a variety of deep learning frameworks such as TensorFlow, Deeplearning4j and CNTK. Keras is very user-friendly, follows a modular approach and supports both CPU and GPU-based computations. If you want to make the deep learning process simpler and more effective, this library is definitely worth checking out!

Editor's Tip: If you're looking for a resource that teaches you how to use Keras effectively, our trending book Deep Learning with Keras will be of great help to you!

9. PyTorch

One of the more recent additions to the Python deep learning family is PyTorch, a neural network modeling library with strong GPU support. Although still in a beta stage, this project is backed by bigwigs such as Facebook and Twitter. PyTorch builds on the architecture of Torch, another popular deep learning library, to enable more efficient tensor computation and the implementation of dynamic neural networks.

Editor's Tip: Here is Deep Learning with PyTorch to get you started with this amazing tool.

Natural Language Processing

Natural Language Processing pertains to designing systems that process, interpret and analyze human language, spoken or written. Python offers unique libraries for performing a variety of tasks such as working with structured and unstructured text, predictive analytics and much more.

10. NLTK

NLTK is a popular Python library for language processing. It offers easy-to-use interfaces for a variety of NLP tasks such as text classification, tokenization, text parsing, semantic reasoning and much more. It is an open source, community-driven project, and has support for both Python 2 and Python 3.

11. spaCy

spaCy is another library for advanced natural language processing, based on Python and Cython. It has extensive support for various deep learning libraries and frameworks such as TensorFlow and PyTorch. With spaCy, you can build complex statistical models for NLP with relative ease. spaCy is easy to install and use, and proves to be of great help when it comes to large-scale extraction and analysis of textual information.

Editor's Tip: To know more about how these libraries are used for natural language processing, make sure you check out the book Natural Language Processing with Python Cookbook.

Data Visualization

Data visualization is a popular Data Science technique for visually analysing and communicating information and valuable business insights through graphs, charts, dashboards and reports. Python offers a lot of popular libraries for effective data storytelling. Some of them are listed below.

12. matplotlib

matplotlib is the most popular Python library for data visualization, allowing for enterprise-grade 2D and 3D plotting. With matplotlib, you can build different kinds of visualizations such as histograms, bar charts, scatter plots and much more, with just a few lines of code. The popularity of matplotlib rivals that of R's highly acclaimed ggplot2, and deciding which library is better has been a hot topic of debate for many years now. matplotlib runs seamlessly in all Python consoles, including IPython and Jupyter notebooks, giving you all the necessary tools to create and share your data visualizations with others.

Editor's Tip: Get started with matplotlib today, with the help of Matplotlib 2.x By Example.
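To keep the visualization section concrete as well, here is a minimal matplotlib sketch; the data is synthetic and the output filename is just an example:

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic data: a noisy sine wave.
x = np.linspace(0, 10, 200)
y = np.sin(x) + np.random.normal(scale=0.1, size=x.shape)

# A basic labelled line plot, saved to disk.
plt.plot(x, y, label="noisy sine")
plt.xlabel("x")
plt.ylabel("y")
plt.title("A few lines of matplotlib")
plt.legend()
plt.savefig("example_plot.png")
```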
[box type="info" align="" class="" width=""]Editor's Tip: Get started with matplotlib today, with the help of Matplotlib 2.x By Example [/box] 13. Seaborn Seaborn is a Python-based data visualization library, which finds its roots in matplotlib. Apart from offering attractive and insightful data visualizations, seaborn also offers strong support for other Python libraries such as NumPy and pandas. Per the official seaborn page: “If matplotlib “tries to make easy things easy and hard things possible”, seaborn tries to make a well-defined set of hard things easy too.” 14. Bokeh Bokeh is an interactive data visualization library based on Python. It aims to provide D3.js style elegant graphics and visualizations and runs primarily on modern web browsers. Apart from the ability to create a wide variety of visualizations, Bokeh also supports large-scale interactivity and visualizations of real-time datasets. 15. Plotly Plotly is a popularly used Python library which is used across the world for making publication-quality plots and graphs. With Plotly, you can build interactive dashboards, scatter plots, histograms, candlestick charts, heat maps, and a whole host of other data visualizations with ease. With superior interactivity, deployment and publication capabilities, Plotly is used across different domains, majorly finance and geospatial industries for effective data storytelling. So there you have it! Python has an extensive suite of libraries for every data science related task, each equipped with unique features to make the task fast and hassle-free. While there are a lot more Python libraries out there, we cherry-picked these 15 libraries based on their popularity, usefulness and the value they bring to the table. Also, the extensive community support for Python means you can get help for any kind of problem you might come across while using these tools. It's time now for you to go out there and crunch some data with some of these Python powered libraries!

Top 7 libraries for geospatial analysis

Aarthi Kumaraswamy
22 May 2018
12 min read
The term geospatial refers to information that is located on the earth's surface. This can include, for example, the position of a cellphone tower, the shape of a road, or the outline of a country. Geospatial data often associates some piece of information with a particular location. Geospatial development is the process of writing computer programs that can access, manipulate, and display this type of information. Internally, geospatial data is represented as a series of coordinates, often in the form of latitude and longitude values. Additional attributes, such as temperature, soil type, height, or the name of a landmark, are also often present. There can be many thousands (or even millions) of data points for a single set of geospatial data. In addition to the prosaic tasks of importing geospatial data from various external file formats and translating data from one projection to another, geospatial data can also be manipulated to solve various interesting problems. Obvious examples include calculating the distance between two points, calculating the length of a road, or finding all data points within a given radius of a selected point. We use libraries to solve all of these problems and more. Today we will look at the major libraries used to process and analyze geospatial data:

  • GDAL/OGR
  • GEOS
  • Shapely
  • Fiona
  • Python Shapefile Library (pyshp)
  • pyproj
  • Rasterio
  • GeoPandas

This is an excerpt from the book Mastering Geospatial Analysis with Python by Paul Crickard, Eric van Rees, and Silas Toms.

Geospatial Data Abstraction Library (GDAL) and the OGR Simple Features Library

The Geospatial Data Abstraction Library (GDAL)/OGR Simple Features Library combines two separate libraries that are generally downloaded together as GDAL. This means that installing the GDAL package also gives access to OGR functionality. The reason GDAL is covered first is that other packages were written after GDAL, so chronologically, it comes first. As you will notice, some of the packages covered in this post extend GDAL's functionality or use it under the hood. GDAL was created in the 1990s by Frank Warmerdam and saw its first release in June 2000. Later, the development of GDAL was transferred to the Open Source Geospatial Foundation (OSGeo). Technically, GDAL is a little different from your average Python package, as the GDAL package itself was written in C and C++, meaning that in order to be able to use it in Python, you need to compile GDAL and its associated Python bindings. However, using conda and Anaconda makes it relatively easy to get started quickly. Because it was written in C and C++, the online GDAL documentation is written for the C++ version of the libraries. For Python developers, this can be challenging, but many functions are documented and can be consulted with the built-in pydoc utility, or by using the help function within Python. Because of its history, working with GDAL in Python also feels a lot like working in C++ rather than pure Python. For example, the naming convention in OGR is different from Python's, since you use uppercase for functions instead of lowercase. These differences explain the choice of some of the other Python libraries covered in this post, such as Rasterio and Shapely, which have been written from a Python developer's perspective but offer the same GDAL functionality.

GDAL is a massive and widely used data library for raster data. It supports the reading and writing of many raster file formats, with the latest version counting up to 200 different file formats that are supported. Because of this, it is indispensable for geospatial data management and analysis. Used together with other Python libraries, GDAL enables some powerful remote sensing functionalities. It's also an industry standard and is present in commercial and open source GIS software. The OGR library is used to read and write vector-format geospatial data, supporting reading and writing in many different formats. OGR uses a consistent model to be able to manage many different vector data formats. You can use OGR to do vector reprojection, vector data format conversion, vector attribute data filtering, and more. The GDAL/OGR libraries are not only useful for Python programmers but are also used by many GIS vendors and open source projects. The latest GDAL version at the time of writing is 2.2.4, which was released in March 2018.
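To give a feel for the GDAL/OGR Python bindings described above, here is a minimal, hedged sketch of opening one raster and one vector dataset; the file names are placeholders, and the snippet assumes the osgeo bindings are installed (for example via conda):

```python
from osgeo import gdal, ogr

# Raster side: open a GeoTIFF (placeholder filename) and inspect its size.
raster = gdal.Open("elevation.tif")
print("raster size:", raster.RasterXSize, "x", raster.RasterYSize)
band = raster.GetRasterBand(1)
print("band data type:", gdal.GetDataTypeName(band.DataType))

# Vector side: open a shapefile (placeholder filename) and count its features.
driver = ogr.GetDriverByName("ESRI Shapefile")
vector = driver.Open("roads.shp")
layer = vector.GetLayer()
print("feature count:", layer.GetFeatureCount())
```

Note how the CamelCase method names echo the underlying C++ API – exactly the stylistic mismatch that motivated the more Pythonic libraries covered next.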
GEOS

The Geometry Engine – Open Source (GEOS) is the C/C++ port of a subset of the Java Topology Suite (JTS) and selected functions. GEOS aims to contain the complete functionality of JTS in C++. It can be compiled on many platforms and used from many languages, including Python. As you will see later on, the Shapely library uses functions from the GEOS library. In fact, there are many applications using GEOS, including PostGIS and QGIS. GeoDjango also uses GEOS, as well as GDAL, among other geospatial libraries. GEOS can also be compiled with GDAL, giving OGR all of its capabilities. The JTS is an open source geospatial computational geometry library written in Java. It provides various functionalities, including a geometry model, geometric functions, spatial structures and algorithms, and I/O capabilities. Using GEOS, you have access to the following capabilities: geospatial functions (such as within and contains), geospatial operations (union, intersection, and many more), spatial indexing, Open Geospatial Consortium (OGC) well-known text (WKT) and well-known binary (WKB) input/output, the C and C++ APIs, and thread safety.

Shapely

Shapely is a Python package for manipulation and analysis of planar features, using functions from the GEOS library (the engine of PostGIS) and a port of the JTS. Shapely is not concerned with data formats or coordinate systems, but can be readily integrated with packages that are. Shapely only deals with analyzing geometries and offers no capabilities for reading and writing geospatial files. It was developed by Sean Gillies, who was also the person behind Fiona and Rasterio. Shapely supports eight fundamental geometry types that are implemented as classes in the shapely.geometry module: points, multipoints, linestrings, multilinestrings, linearrings, multipolygons, polygons, and geometrycollections. Apart from representing these geometries, Shapely can be used to manipulate and analyze them through a number of methods and attributes. Shapely has largely the same classes and functions as OGR for dealing with geometries. The difference between Shapely and OGR is that Shapely has a more Pythonic and very intuitive interface, is better optimized, and has well-developed documentation. With Shapely, you're writing pure Python, whereas with GEOS, you're writing C++ in Python. For data munging, a term used for data management and analysis, you're better off writing in pure Python rather than C++, which explains why these libraries were created. For more information on Shapely, consult the documentation.
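As a small, hedged illustration of that Pythonic interface, here is what a couple of common geometric operations look like in Shapely (the coordinates are arbitrary):

```python
from shapely.geometry import Point, Polygon

# Two simple geometries with arbitrary coordinates.
point = Point(1.0, 1.0)
square = Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])

# Spatial predicates and operations read almost like plain English.
print(square.contains(point))                # True
print(square.area)                           # 4.0
print(point.buffer(0.5).intersects(square))  # True
print(square.exterior.length)                # 8.0
```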
The Shapely documentation also has detailed information on installing Shapely for different platforms, and on how to build Shapely from source for compatibility with other modules that depend on GEOS – installing Shapely may require you to upgrade NumPy and GEOS if these are already installed.

Fiona

Fiona is a Python API for OGR. It can be used for reading and writing data in a range of vector formats. The main reason for using it instead of OGR is that it's closer to Python than OGR, as well as more dependable and less error-prone. It makes use of two markup languages, WKT and WKB, for representing spatial information with regard to vector data. As such, it combines well with other Python libraries such as Shapely: you would use Fiona for input and output, and Shapely for creating and manipulating geospatial data. While Fiona is Python compatible and our recommendation, users should also be aware of some of the disadvantages. It is more dependable than OGR because it uses Python objects for copying vector data instead of C pointers, but this also means that it uses more memory, which affects performance.

Python Shapefile Library (pyshp)

The Python Shapefile Library (pyshp) is a pure Python library used to read and write shapefiles. The pyshp library's sole purpose is to work with shapefiles – it only uses the Python standard library. You cannot use it for geometric operations. If you're only working with shapefiles, this one-file-only library is simpler than using GDAL.

pyproj

pyproj is a Python package that performs cartographic transformations and geodetic computations. It is a Cython wrapper that provides Python interfaces to PROJ.4 functions, meaning you can access an existing library of C code from Python. PROJ.4 is a projection library that transforms data among many coordinate systems and is also available through GDAL and OGR. The reason PROJ.4 is still popular and widely used is two-fold: firstly, because it supports so many different coordinate systems, and secondly, because of the routes it provides to them – Rasterio and GeoPandas, two Python libraries covered next, both use pyproj and thus PROJ.4 functionality under the hood. The difference between using PROJ.4 separately rather than through a package such as GDAL is that it enables you to re-project individual points, something packages using PROJ.4 do not offer. The pyproj package offers two classes – the Proj class and the Geod class. The Proj class performs cartographic computations, while the Geod class performs geodetic computations.

Rasterio

Rasterio is a GDAL- and NumPy-based Python library for raster data, written with the Python developer in mind instead of C, using Python language types, protocols, and idioms. Rasterio aims to make GIS data more accessible to Python programmers and helps GIS analysts learn important Python standards. Rasterio relies on concepts of Python rather than GIS. Rasterio is an open source project from the satellite team of Mapbox, a provider of custom online maps for websites and applications. The name of this library should be pronounced raster-i-o rather than ras-te-rio. Rasterio came into being as a result of a project called the Mapbox Cloudless Atlas, which aimed to create a pretty-looking basemap from satellite imagery. One of the software requirements was to use open source software and a high-level language with handy multi-dimensional array syntax. Although GDAL offers proven algorithms and drivers, developing with GDAL's Python bindings feels a lot like C++.
Therefore, Rasterio was designed to be a Python package at the top, with extension modules (using Cython) in the middle, and a GDAL shared library at the bottom. Other requirements for the raster library were being able to read and write NumPy ndarrays to and from data files, and using Python types, protocols, and idioms instead of C or C++, to free programmers from having to code in two languages. For georeferencing, Rasterio follows the lead of pyproj. There are a couple of capabilities added on top of reading and writing, one of them being a features module. Reprojection of geospatial data can be done with the rasterio.warp module. Rasterio's project homepage can be found on GitHub.

GeoPandas

GeoPandas is a Python library for working with vector data. It is based on the pandas library that is part of the SciPy stack. SciPy is a popular library for data inspection and analysis, but unfortunately, it cannot read spatial data. GeoPandas was created to fill this gap, taking pandas data objects as a starting point. The library also adds functionality from geographical Python packages. GeoPandas offers two data objects – a GeoSeries object that is based on a pandas Series object, and a GeoDataFrame, based on a pandas DataFrame object but adding a geometry column for each row. Both GeoSeries and GeoDataFrame objects can be used for spatial data processing, similar to spatial databases. Read and write functionality is provided for almost every vector data format. Also, because both objects are subclasses of pandas data objects, you can use the same properties to select or subset data, for example .loc or .iloc. GeoPandas is a library that employs the capabilities of newer tools, such as Jupyter Notebooks, pretty well, whereas GDAL enables you to interact with data records inside vector and raster datasets through Python code. GeoPandas takes a more visual approach by loading all records into a GeoDataFrame so that you can see them all together on your screen. The same goes for plotting data. These functionalities were lacking in Python 2, as developers were dependent on IDEs without the extensive data visualization capabilities that are now available with Jupyter Notebooks.

We've provided an overview of the most important open source packages for processing and analyzing geospatial data. The question then becomes when to use a certain package and why. GDAL, OGR, and GEOS are indispensable for geospatial processing and analysis, but were not written in Python, and so they require Python bindings for Python developers. Fiona, Shapely, and pyproj were written to solve these problems, as was the newer Rasterio library. For a more Pythonic approach, these newer packages are preferable to the older C++ packages with Python bindings (although they're used under the hood). Now that you have an idea of what options are available for a certain use case and why one package is preferable over another, here's something you should always remember: as is often the way in programming, there might be multiple solutions for one particular problem. For example, when dealing with shapefiles, you could use pyshp, GDAL, Shapely, or GeoPandas, depending on your preference and the problem at hand.

Read next:

  • Introduction to Data Analysis and Libraries
  • 15 Useful Python Libraries to make your Data Science tasks Easier
  • "Pandas is an effective tool to explore and analyze data": An interview with Theodore Petrou
  • Using R to implement Kriging – A Spatial Interpolation technique for Geostatistics data

React.js: why you should learn the front end JavaScript library and how to get started

Guest Contributor
25 Aug 2019
9 min read
React.js is one of the most powerful JavaScript libraries. It powers the interfaces of major organisations such as Amazon (the e-commerce giant has recently introduced a programming language of its own), PayPal, BBC, CNN, and over a million other websites worldwide. Created by Facebook, React.js has quickly built a formidable technical reputation and a loyal fan following. Currently React.js is extensively mentioned in job openings – companies want to hire dedicated React.js developers more than Vue.js engineers. In this post, you'll find out why React.js is the right framework to start your remote work with, despite the library's steep learning curve, and what the ways to use it more efficiently are.

5 Reasons to learn React.js

Developers might be hesitant to learn React as it's not a full-fledged framework and a developer needs to handle models and controllers on their own. Nevertheless, there are more than a handful of reasons to become a React.js developer. Let's take a closer look at them.

1. It's functional

There's no need to use classes in React. The platform relies heavily on functional components, allowing developers not to overcomplicate the codebase. While classes offer developers a handful of convenient features (using lifecycle hooks, and such), the benefits provided by the functional syntax are loud and clear:

  • Higher readability. Properties like state functions or lifecycle hooks tend to make reading and testing the code a pain in the neck. Plain JS functions are easier to wrap your head around, and a developer can achieve the same functionality with less code.
  • The software engineering team will more likely adhere to best practices. Stateless functional components encourage front-end engineers to separate presentational and container components. It takes more time to adjust to a more complex workflow, but in the long run it pays off in a better code structure.
  • ES6 destructuring helps spot bloated components. A developer can see the list of dependencies bound to every component. As a result, you will be able to break up overly complex structures or rethink them altogether.

React.js is the tool that recognizes the power of functional components to their fullest extent (even the glorified Angular 2 can't compare). As a result, developers can strive for maximum code eloquence and improved performance.

2. It's declarative

Most likely, you are no stranger to CSS and SQL, and, as such, are familiar with declarative programming. Still, to recap, here are the differences between the declarative and imperative approaches: imperative programming uses statements to manipulate the state of the program, while declarative programming is a paradigm that changes the system based on the communication logic. While imperative programming gives developers the possibility to design a control flow step by step in statements, and may come across as easier, it is declarative programming that has more perks in the long run. Higher readability: low-level details will not clutter the code, as the paradigm is not concerned with them. More freedom for reasoning: instead of outlining the procedure step by step, a successful React.js developer focuses on describing the solution and its logic. Reusability: you can apply a declarative description to various scenarios – that is far more challenging for a step-by-step construct. And efficiency in solving specific domain problems: the high performance of declarative programming stems from the fact that it adapts to the domain.
For databases, for instance, a developer will create a set of operations to handle data, and so on. Capitalizing on the benefits of declarative programming is React's strong point. You will be able to create transparent, reusable, and highly readable interfaces.

3. Virtual DOM

Developers that manage high-load projects often face DOM-related challenges. Bottlenecks tend to appear even after a small change in the document object model. Due to the document object model's tree structure, there's high interconnectivity between DOM components. To facilitate maintenance, Facebook has implemented the virtual DOM in React.js. It allows developers to ensure the project's error-free performance before updating the actual DOM tree. The virtual DOM provides extra assurance in the app's performance – in the long run, it significantly improves user satisfaction rates.

4. Downward data binding

As opposed to Angular's two-way data binding, React.js uses a downward structure to ensure that changes in child structures will not affect parents. A developer can only transfer data from a parent to a child, not vice versa. The key components of downward data binding include: passing the state to the child components as well as the view; the view triggers actions; actions can update the state; and state updates are passed on to the view and the child components. Compared to two-way data binding, the one implemented by React.js is not as error-prone (a developer controls data to a larger extent), and is more comfortable to test and debug due to a clearly defined structure.

5. React Developer Tools

React.js developers get to benefit from a wide toolkit that covers all facets of application performance. There's a wide array of debugging and design solutions, including the life-saving React Developer Tools extension for Chrome and Firefox. Using this and other tools, you can define child and parent components, examine their state, observe hierarchies, and inspect props.

Advantages of React.js

React.js helps developers systemize the interfaces of their projects by introducing the 'components' structure. The library allows the creation of modular views that consist of reusable blocks – pop-ups, tables, etc. One of the most significant advantages of using React.js is the way it improves user experience. A textbook example of library usage on Facebook is the possibility to see the changing number of likes in real time without reloading the page. Originally, React.js was created back in 2011 by a Facebook engineer as a way to upscale and maintain the complex interface of the Facebook Ads app. The library's high functionality resulted in its adoption by other SMEs and large corporations – now React.js is one of the most widely used development tools.

How to use React.js

Depending on your HTML and JavaScript proficiency, it may take anywhere from a few days to months to get the hang of React. For a basic understanding of the library, take a look at React.js's features as well as the setup process.

Getting started with React.js

To start working with React, a developer has to import the React and ReactDOM libraries in a basic HTML file. Now that you have set up a working space, take your time to examine the defining features of React.js.

Components

All React.js elements are components. Depending on the syntax, they are grouped into class and functional ones. As, in most cases, both lead to equal outcomes, a React.js beginner should start by learning functional components.
Props

Props are the way for React.js developers to pass data from parent to child structures. Keep in mind that, unlike states, props are immutable under any circumstances. They provide developers with high code reusability, as the same message will be displayed on all pages. At times, developers do want components to change themselves. That's when states come in handy.

States

States are used when a developer wants the application data to change. The most common operations that have to do with states include initialization, modification, and adding event handlers. These were the basic concepts a React.js developer has to be familiar with to get the most out of the library.

React.js best practices

If you're already using React.js, be sure to make the most out of it. Keep track of new trends and best practices in all facets of app management – accessibility, performance, security, and others. Here's a short collection of React.js development tips that will improve maintenance and development efficiency.

Performance

  • Consider using React.Fragment to avoid extra DOM nodes.
  • To load components on demand, use React.lazy along with React.Suspense.
  • Another popular practice among JS developers is taking advantage of shouldComponentUpdate to avoid unnecessary rendering.
  • Try to keep the JS code as clean as possible. For instance, clean up anything a component no longer needs (timers, subscriptions, and so on) in componentWillUnmount().
  • For component caching, use React.memo.

Accessibility

  • Pay attention to the casing and reserved word differences in HTML and React.js to avoid bottlenecks.
  • To set up page titles, use a plugin such as react-helmet.
  • Don't forget to put ALT tags on any non-text content.
  • Use refs to pinpoint the focus on a given component.
  • External tools like ESLint accessibility plugins help developers monitor accessibility.

Debugging

  • Use Chrome DevTools – there are dozens of features, such as a Redux logger, an error message handler, and so on.
  • Leave the console open while coding to detect errors faster.
  • To have a better understanding of the code you're dealing with, adopt a table view for objects.
  • Other quick debugging hacks include marking DOM items to find them quickly in the Google Chrome Inspector, and viewing full stack traces for functions.

The bottom line

Thanks to a powerful team of engineers at work, React.js has quickly become a powerhouse for front-end development. Its heavy reliance on JavaScript makes the library easier to get to know. React.js's pros and cons make for a long list, but the possibility to express UIs declaratively, along with the promotion of functional components, makes it a favorite framework for many. The wide variety of projects it powers and the large number of job openings prove that knowing React is no longer optional for developers. The good news: there's no lack of learning tools and resources online. Take your time to explore the library – you'll be amazed by the order and efficiency React brings to applications.

Author Bio

Anastasia Stefanuk is a passionate writer and a marketing manager at Mobilunity. The company provides professional staffing services, so she is always aware of technology news and wants to share her experience to help tech startups and companies to be up-to-date.

Read next:

  • Getting started with React Hooks by building a counter with useState and useEffect
  • React 16.9 releases with an asynchronous testing utility, programmatic Profiler, and more
  • 5 Reasons to Learn ReactJS

7 Best Practices for Logging in Node.js

Guest Contributor
05 Mar 2019
5 min read
Node.js is one of the easiest platforms for prototyping and agile development. It's used by large companies looking to scale their products quickly. However, using a platform on its own isn't enough for most big projects today. Logging is also a key part of ensuring your web or mobile app runs smoothly for all users. Application logging is the practice of recording information about your application's runtime. These files are usually saved to a logging platform, which helps identify potential problems. While no app is perfect 100% of the time, logging helps developers cut down on errors and even cyber attacks. The nature of software is complex. We can't always predict how an application will react to data, errors, or system changes. Logging helps us better understand our own programs. So how do you handle logging in Node.js specifically? Following are some of the best practices for logging in Node.js to get the best results.

1. Understand the Regulations

Let's discuss the current legal regulations about what you can and cannot log. You should never log sensitive information or personal data. That means excluding credentials like passwords, credit card numbers or even email addresses. Recent changes to regulations like Europe's GDPR make this even more essential. You don't want to get tied up in the legal red tape of sensitive data. When in doubt, stick to the three things that are needed for a solid log message: timestamp, log level, and description. Beyond this, you don't need any extensive framework.

2. Take advantage of Winston

Winston is a popular logging framework for Node.js. Winston is defined as a transport for your logs, and you can install it directly into your application. Follow this guide to install Winston on your own. Winston is a powerful tool that comes with different logging levels with values. You can fully customize this console as well, with colors, messages, and output details. The most recent version available is 3.0.0, but always make sure you have the latest edition to keep your app running smoothly.

3. Add Morgan

In addition to Winston, Morgan is an HTTP request logger that collects server logs and standardizes them. Think of it as a logger simplification. While you're free to use Morgan on its own, most developers choose to use it with Winston, since the two make a powerful team. Morgan also works well with Express.js.

4. Consider the Intel Package

While Winston and Morgan are a great combination, they're not your only option. Intel is another package solution with similar features as well as unique options. While you'll see a lot of overlap in what they offer, Intel also includes a stack trace object. These features will come in handy when it's time to actually debug. Because it gives a stack trace as a JSON object, it's much easier to pass messages up the logger chain. Think of Intel like the breadcrumbs taking your developers to the error.

5. Use Environment Variables

You'll hear a lot of discussion about configuration management in the Node.js world. Decoupling your code from services and databases is no straightforward process. In Node.js, it's best to use environment variables. You can also look up values from process.env within your code. To determine which environment your program is running on, look up the NODE_ENV variable. You can also use the nconf module found here.

6. Choose a Style Guide

No developer wants to spend time reading through lines of code only to have to change the spaces to tabs, reformat the braces, etc.
Style guides are a must, especially when logging on Node.js. If you're working with a team of developers, it's time to decide on a team style guide that everyone sticks to across the board. When the code is written in a consistent style, you don't have to worry about opinionated developers fighting for a say. It doesn't matter which style you stick with, just make sure you can actually stick to it. The Google style guide for Java is a great place to start if you can't make a single decision.

7. Deal with Errors

Finally, accept that errors will happen and prepare for them. You don't want an error to bring down your entire software or program. Exception management is key. Use an async structure to cleanly handle any errors. Whether the app simply restarts or moves on to the next stage, make sure something happens. Users need their errors to be handled.

As you can see, there are a few best practices to keep in mind when logging in Node.js. Don't rely on your developers alone to debug the platform. Set a structure in place to handle these problems as they arise. Your users expect a quality experience every time. Make sure you can deliver with these tips above.

Author Bio

Ashley Lipman, content marketing specialist. Ashley is an award-winning writer who discovered her passion for providing creative solutions for building brands online. Since her first high school award in Creative Writing, she continues to deliver awesome content through various niches.

Read next:

  • Introducing Zero Server, a zero-configuration server for React, Node.js, HTML, and Markdown
  • 5 reasons you should learn Node.js
  • Deploying Node.js apps on Google App Engine is now easy

Bridging the gap between data science and DevOps with DataOps

Richard Gall
23 Mar 2016
5 min read
What's the real value of data science? Hailed as the sexiest job of the 21st century just a few years ago, data science is now rumored to be not quite proving its worth. Gianmario Spacagna, a data scientist for Barclays bank in London, told Computing magazine at Spark Summit Europe in October 2015 that, in many instances, there's not enough impact from data science teams – "It's not a playground. It's not academic," he said. His solution sounds simple: we need to build a bridge between data science and DevOps – and DataOps is perhaps the answer. He says: "If you're a start-up, the smartest person you want to hire is your DevOps guy, not a data scientist. And you need engineers, machine learning specialists, mathematicians, statisticians, agile experts. You need to cover everything otherwise you have a very hard time to actually create proper applications that bring value."

This idea makes a lot of sense. It's become clear over the past few years that 'data' itself isn't enough; it might even be distracting for some organizations. Sometimes too much time is spent in spreadsheets and not enough time is spent actually doing stuff. Making decisions, building relationships, building things – that's where real value comes from. What Spacagna has identified is ultimately a strategic flaw in how data science is used in many organizations. There's often too much focus on what data we have and what we can get, rather than who can access it and what they can do with it. If data science isn't joining the dots, DevOps can help. True, a large part of the problem is strategic, but DevOps engineers can also provide practical solutions by building dashboards and creating APIs. These sorts of things immediately give data additional value by making it more accessible and, put simply, more usable.

Even for a modest medium-sized business, data scientists and analysts will have minimal impact if they are not successfully integrated into the wider culture. While it's true that many organizations still struggle with this, Airbnb demonstrates how to do it incredibly effectively. Take a look at their Airbnb Engineering and Data Science publication on Medium, where they talk about the importance of scaling knowledge effectively. Although they don't specifically refer to DevOps, it's clear that DevOps thinking has informed their approach. In the products they've built to scale knowledge, for example, the team demonstrates a very real concern for accessibility and efficiency. What they build is created so people can do exactly what they want and get what they need from data. It's a form of strict discipline that is underpinned by a desire for greater freedom.

If you keep reading Airbnb's publication, another aspect of 'DevOps thinking' emerges: a relentless focus on customer experience. By this, I don't simply mean that the work done by the Airbnb engineers is informed by a desire to improve customer experiences; that's obvious. Instead, it's the sense that the tools through which internal collaboration and decision making take place should actually feel like a customer experience. They need to be elegant, engaging, and intuitive. This doesn't mean seeing every relationship as purely transactional, based on some perverse logic of self-interest, but rather having a deeper respect for how people interact and share ideas. If DevOps is an agile methodology that bridges the gap between development and operations, it can also help to bridge the gap between data and operations.
DataOps - bringing DevOps and data science together This isn’t a new idea. As much as I’d like to, I can’t claim credit for inventing ‘DataOps’. But there’s not really much point in asserting that distinction. DataOps is simply another buzzword for the managerial class. And while some buzzwords have value, I’m not so sure that we need another one. More importantly, why create another gap between Data and Development? That gap doesn’t make sense in the world we’re building with software today. Even for web developers and designers, the products they are creating are so driven by data that separating the data from the dev is absurd. Perhaps then, it’s not enough to just ask more from our data science as Gianmario Spacagna does. DevOps offers a solution, but we’re going to miss out on the bigger picture if we start asking for more DevOps engineers and some space for them to sit next to the data team. We also need to ask how data science can inform DevOps too. It’s about opening up a dialogue between these different elements. While DevOps evangelists might argue that DevOps has already started that, the way forward is to push for more dialogue, more integration and more collaboration. As we look towards the future, with the API economy becoming more and more important to the success of both startups and huge corporations, the relationships between all these different areas are going to become more and more complex. If we want to build better and build smarter we’re going to have to talk more. DevOps and DataOps both offer us a good place to start the conversation, but it’s important to remember it’s just the start.

Best game engines for Artificial Intelligence game development

Natasha Mathur
24 Aug 2018
8 min read
"A computer would deserve to be called intelligent if it could deceive a human into believing that it was human" — Alan Turing It is quite common to find games which are initially exciting but take a boring turn eventually, making you want to quit the game. Then, there are games which are too difficult to hold your interest and you end up quitting in the beginning phase itself.  These are also two of the most common problems that game developers face when building games. This is where AI comes to your rescue, to spice things up. Why use Artificial Intelligence in games? The major reason for using AI in games is to provide a challenging opponent to make the game more fun to play. But, AI in the gaming industry is not a recent news. The gaming world has been leveraging the wonders of AI for a long time now. One of the first examples of AI is the computerized game, Nim, was created back in 1951. Other games such as Façade, Black & White, The Sims, Versu, and F.E.A.R. are all great AI games, that hit the market long time back. Even modern-day games like Need for Speed, Civilization, or Counter-Strike use AI. AI controls a lot of elements in games and is usually behind characters such as enemy creeps, neutral merchants, or even animals. AI in games is used to enable the non-human characters (NPCs) with responsive, adaptive, and intelligent behaviors similar to human-like intelligence. AI helps make NPCs seem intelligent as they are able to actively change their level of skills based on the person playing the game. This makes the game seem more personalized to the gamer. Playing video games is fun, and developing these games is equally fun. There are different game engines on the market to help with the development of games. A game engine is a software that provides game creators with the necessary set of features to build games quickly and efficiently. Let’s have a look at the top game engines for Artificial Intelligence game development. Unity3D Developer:  Unity Technologies Release Date: June 8, 2005 Unity is a cross-platform game engine which provides users with the ability to create games in both 2D and 3D. It is extremely popular and loved by game designers from large and small studios alike. Apart from 3D, and 2D games, it also helps with simulations for desktops, laptops, home consoles, smart TVs, and mobile devices. Key AI features: Unity offers a machine learning agents toolkit to the game developers, which help them include AI agents within games. As per the Unity team, “machine Learning Agents Toolkit (ML-Agents) is an open-source Unity plugin that enables games and simulations to serve as environments for training intelligent agents”. Unity AI - Unity 3D Artificial Intelligence  The ML-Agents SDK transforms games and simulations created using the Unity Editor into environments for training intelligent agents. These ML agents are trained using deep Reinforcement Learning, imitation learning, neuroevolution, or other machine learning methods via Python APIs. There’s also a TensorFlow based algorithm provided by Unity to allow game developers to easily train intelligent agents for 2D, 3D, and VR/AR games. These trained agents are then used for controlling the NPC behavior within games. The ML-Agents toolkit is beneficial for both game developers and AI researchers. Apart from this, Unity3D is easy to use and learn, compatible with every game platform and provides great community support. 
Learning Resources:
Unity AI Programming Essentials
Unity 2017 Game AI Programming - Third Edition
Unity 5.x Game AI Programming Cookbook

Unreal Engine 4

Developer: Epic Games
Release Date: May 1998

Unreal Engine is widely used among developers all around the world. It is a collection of integrated tools for game developers which helps them build games, simulations, and visualizations. It is also among the top game engines used to develop high-end AAA titles. Gears of War, Batman: Arkham Asylum, and Mass Effect are some of the popular games developed using Unreal Engine.

Key AI features: Unreal Engine uses a set of tools to add AI capabilities to a game, such as the Behavior Tree, Navigation Component, Blackboard Asset, Enumeration, Target Point, AI Controller, and Navigation Volumes. The Behavior Tree creates the different states and the logic behind the AI. The Navigation Component handles movement for the AI. The Blackboard Asset stores information and acts as the local variables for the AI. Enumeration creates states and allows alternating between them. The Target Point creates a basic path node. The AI Controller and Character tools are responsible for handling the communication between the world and the controlled pawn. Finally, the Navigation Volumes feature creates the navigation mesh in the environment to allow easy pathfinding for the AI. There are also features such as Blueprint Visual Scripting, which can be converted into performant C++ code, AIComponents, and the Environment Query System (EQS), which gives agents the ability to perceive their environment.

Apart from its AI capabilities, Unreal Engine offers the largest community support, with countless hours of video tutorials and assets. It is also compatible with a variety of platforms such as iOS, Android, Linux, Mac, Windows, and most game consoles. However, there are certain built-in tools in Unreal Engine which can be hard for beginners to learn.

Learning resources: Unreal Engine 4 AI Programming Essentials

CryEngine 3

Developer: Crytek
Release Date: May 2, 2002

CryEngine is a powerful game development platform that comes packed with a set of tools and features to create world-class gaming experiences. It is the game engine behind games such as Sniper: Ghost Warrior 2 and SNOW.

Key AI features: CryEngine comes with an AI system designed for the easy creation of custom AI actors, flexible enough to handle a large set of complex and varied worlds. The core of CryEngine's AI system is based heavily on scripting. There are different AI elements within this system that add AI capabilities to the NPCs in a game. Some of these elements are: AI Actions, which let developers script AI behaviors without writing new code; the AI Actors Logger, which can log AI events and signals to files; AI Control Objects, which use AI objects to control AI entities and actors; AI Debug Draw, the primary tool offered by CryEngine for information on the current state of the AI system and AI actors; the AI Debugger, which registers the inputs that AI agents receive and the decisions they make in real time during a game session; and the AI Sequence system, which works in parallel to the FG and AI systems and helps to simplify and group AI control.

CryEngine offers the easiest AI coding of any tech currently on the market. However, since CryEngine is relatively new compared to other game engines, it does not have a very flourishing community yet, and despite the easy AI coding, its overall learning curve is high.
Panda3D

Developer: Disney Interactive until 2010, Walt Disney Imagineering, Carnegie Mellon University
Release Date: 2002

Panda3D is a game engine and framework for 3D rendering and game development for Python and C++ programs. It includes graphics, audio, I/O, collision detection, and other abilities for the creation of 3D games.

Key AI features: Panda3D comes packed with an AI library named PandAI v1.0. PandAI is an AI library which provides 'artificially intelligent' behavior for NPCs (non-playable characters) in games. The PandAI library offers functionality for steering behaviors (Seek, Flee, Pursue, Evade, Wander, Flock, Obstacle Avoidance, Path Following) and pathfinding, which helps NPCs intelligently avoid obstacles via the shortest path. This AI library is composed of several different entities. For instance, there's a main AIWorld class that updates any AICharacters added to it. Each AICharacter has its own AIBehavior object for tracking all of its position and rotation updates, and each AIBehavior object implements the steering and pathfinding behaviors. These features within Panda3D give you the ability to call the respective functions. Panda3D is a relatively simple game engine which lets you add AI capabilities to your games. Its community is not as robust as those of the other engines, but it has a low learning curve.
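To show roughly what the PandAI workflow looks like in Python, here is a hedged sketch of a single "seek" behavior. The model paths and tuning numbers (mass, movement force, maximum force) are placeholder values, and the class and method names follow the PandAI examples in the Panda3D manual, so check them against the version you have installed.

from direct.showbase.ShowBase import ShowBase
from direct.task import Task
from panda3d.ai import AIWorld, AICharacter

class SeekDemo(ShowBase):
    def __init__(self):
        ShowBase.__init__(self)
        # Placeholder assets: "seeker" will chase "target".
        self.seeker = self.loader.loadModel("models/panda")
        self.seeker.reparentTo(self.render)
        self.target = self.loader.loadModel("models/box")
        self.target.reparentTo(self.render)
        self.target.setPos(20, 20, 0)

        # AIWorld manages every AICharacter added to it.
        self.ai_world = AIWorld(self.render)
        # Arguments: name, NodePath, mass, movement force, maximum force.
        self.ai_char = AICharacter("seeker", self.seeker, 100, 0.05, 5)
        self.ai_world.addAiChar(self.ai_char)
        self.ai_char.getAiBehaviors().seek(self.target)

        self.taskMgr.add(self.update_ai, "update-ai")

    def update_ai(self, task):
        self.ai_world.update()  # advances all steering behaviors each frame
        return Task.cont

SeekDemo().run()

Swapping seek for flee, wander, or pathFollow on the same AIBehaviors object is how the other behaviors listed above are triggered.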
AI is a fantastic tool which makes the entities in games seem more organic, alive, and real. The main goal here is not to copy the entire human thought process but simply to sell the illusion of life. These game engines provide developers with the entire framework needed to add AI capabilities to their games. The whole game development process is also more fun, as there is no need to create every system, including physics, graphics, and AI, from scratch. If you're wondering which of the four engines mentioned in this article is the best for AI, there is no specific answer: selecting the best AI game engine depends on the requirements of your project.

Game Engine Wars: Unity vs Unreal Engine
Unity switches to WebAssembly as the output format for the Unity WebGL build target
Developing Games Using AI

article-image-famous-gang-of-four-design-patterns
Sugandha Lahoti
10 Jul 2018
14 min read

Meet the famous 'Gang of Four' design patterns

A design pattern is a reusable solution to a recurring problem in software design. It is not a finished piece of code but a template that helps to solve a particular problem or family of problems. In this article, we will talk about the Gang of Four design patterns. The gang of four, authors Erich Gamma, Richard Helm, Ralph Johnson and John Vlissides, initiated the concept of Design Pattern in Software development. These authors are collectively known as Gang of Four (GOF). We are going to focus on the design patterns from the Scala point of view. All different design patterns can be grouped into the following types: Creational Structural Behavioral These three groups contain the famous Gang of Four design patterns.  In the next few subsections, we will explain the main characteristics of the listed groups and briefly present the actual design patterns that fall under them. This article is an excerpt from Scala Design Patterns - Second Edition by Ivan Nikolov. In this book, you will learn how to write efficient, clean, and reusable code with Scala. Creational design patterns The creational design patterns deal with object creation mechanisms. Their purpose is to create objects in a way that is suitable to the current situation, which could lead to unnecessary complexity and the need for extra knowledge if they were not there. The main ideas behind the creational design patterns are as follows: Knowledge encapsulation about the concrete classes Hiding details about the actual creation and how objects are combined We will be focusing on the following creational design patterns in this article: The abstract factory design pattern The factory method design pattern The lazy initialization design pattern The singleton design pattern The object pool design pattern The builder design pattern The prototype design pattern The following few sections give a brief definition of what these patterns are. The abstract factory design pattern This is used to encapsulate a group of individual factories that have a common theme. When used, the developer creates a specific implementation of the abstract factory and uses its methods in the same way as in the factory design pattern to create objects. It can be thought of as another layer of abstraction that helps to instantiate classes. The factory method design pattern This design pattern deals with the creation of objects without explicitly specifying the actual class that the instance will have—it could be something that is decided at runtime based on many factors. Some of these factors can include operating systems, different data types, or input parameters. It gives developers the peace of mind of just calling a method rather than invoking a concrete constructor. The lazy initialization design pattern This design pattern is an approach to delay the creation of an object or the evaluation of a value until the first time it is needed. It is much more simplified in Scala than it is in an object-oriented language such as Java. The singleton design pattern This design pattern restricts the creation of a specific class to just one object. If more than one class in the application tries to use such an instance, then this same instance is returned for everyone. This is another design pattern that can be easily achieved with the use of basic Scala features. The object pool design pattern This design pattern uses a pool of objects that are already instantiated and ready for use. 
Whenever someone requires an object from the pool, it is returned, and after the user is finished with it, it puts it back into the pool manually or automatically. A common use for pools are database connections, which generally are expensive to create; hence, they are created once and then served to the application on request. The builder design pattern The builder design pattern is extremely useful for objects with many possible constructor parameters that would otherwise require developers to create many overrides for the different scenarios an object could be created in. This is different to the factory design pattern, which aims to enable polymorphism. Many of the modern libraries today employ this design pattern. As we will see later, Scala can achieve this pattern really easily. The prototype design pattern This design pattern allows object creation using a clone() method from an already created instance. It can be used in cases when a specific resource is expensive to create or when the abstract factory pattern is not desired. Structural design patterns Structural design patterns exist in order to help establish the relationships between different entities in order to form larger structures. They define how each component should be structured so that it has very flexible interconnecting modules that can work together in a larger system. The main features of structural design patterns include the following: The use of the composition to combine the implementations of multiple objects Help build a large system made of various components by maintaining a high level of flexibility In this article, we will focus on the following structural design patterns: The adapter design pattern The decorator design pattern The bridge design pattern The composite design pattern The facade design pattern The flyweight design pattern The proxy design pattern The next subsections will put some light on what these patterns are about. The adapter design pattern The adapter design pattern allows the interface of an existing class to be used from another interface. Imagine that there is a client who expects your class to expose a doWork() method. You might have the implementation ready in another class, but the method is called differently and is incompatible. It might require extra parameters too. This could also be a library that the developer doesn't have access to for modifications. This is where the adapter can help by wrapping the functionality and exposing the required methods. The adapter is useful for integrating the existing components. In Scala, the adapter design pattern can be easily achieved using implicit classes. The decorator design pattern Decorators are a flexible alternative to sub classing. They allow developers to extend the functionality of an object without affecting other instances of the same class. This is achieved by wrapping an object of the extended class into one that extends the same class and overrides the methods whose functionality is supposed to be changed. Decorators in Scala can be built much more easily using another design pattern called stackable traits. The bridge design pattern The purpose of the bridge design pattern is to decouple an abstraction from its implementation so that the two can vary independently. It is useful when the class and its functionality vary a lot. 
The bridge reminds us of the adapter pattern, but the difference is that the adapter pattern is used when something is already there and you cannot change it, while the bridge design pattern is used when things are being built. It helps us to avoid ending up with multiple concrete classes that will be exposed to the client. You will get a clearer understanding when we delve deeper into the topic, but for now, let's imagine that we want to have a FileReader class that supports multiple different platforms. The bridge will help us end up with FileReader, which will use a different implementation, depending on the platform. In Scala, we can use self-types in order to implement a bridge design pattern. The composite design pattern The composite is a partitioning design pattern that represents a group of objects that are to be treated as only one object. It allows developers to treat individual objects and compositions uniformly and to build complex hierarchies without complicating the source code. An example of composite could be a tree structure where a node can contain other nodes, and so on. The facade design pattern The purpose of the facade design pattern is to hide the complexity of a system and its implementation details by providing the client with a simpler interface to use. This also helps to make the code more readable and to reduce the dependencies of the outside code. It works as a wrapper around the system that is being simplified and, of course, it can be used in conjunction with some of the other design patterns mentioned previously. The flyweight design pattern The flyweight design pattern provides an object that is used to minimize memory usage by sharing it throughout the application. This object should contain as much data as possible. A common example given is a word processor, where each character's graphical representation is shared with the other same characters. The local information then is only the position of the character, which is stored internally. The proxy design pattern The proxy design pattern allows developers to provide an interface to other objects by wrapping them. They can also provide additional functionality, for example, security or thread-safety. Proxies can be used together with the flyweight pattern, where the references to shared objects are wrapped inside proxy objects. Behavioral design patterns Behavioral design patterns increase communication flexibility between objects based on the specific ways they interact with each other. Here, creational patterns mostly describe a moment in time during creation, structural patterns describe a more or less static structure, and behavioral patterns describe a process or flow. They simplify this flow and make it more understandable. 
The main features of behavioral design patterns are as follows: What is being described is a process or flow The flows are simplified and made understandable They accomplish tasks that would be difficult or impossible to achieve with objects In this article, we will focus our attention on the following behavioral design patterns: The value object design pattern The null object design pattern The strategy design pattern The command design pattern The chain of responsibility design pattern The interpreter design pattern The iterator design pattern The mediator design pattern The memento design pattern The observer design pattern The state design pattern The template method design pattern The visitor design pattern The following subsections will give brief definitions of the aforementioned behavioral design patterns. The value object design pattern Value objects are immutable and their equality is based not on their identity, but on their fields being equal. They can be used as data transfer objects, and they can represent dates, colors, money amounts, numbers, and more. Their immutability makes them really useful in multithreaded programming. The Scala programming language promotes immutability, and value objects are something that naturally occur there. The null object design pattern Null objects represent the absence of a value and they define a neutral behavior. This approach removes the need to check for null references and makes the code much more concise. Scala adds the concept of optional values, which can replace this pattern completely. The strategy design pattern The strategy design pattern allows algorithms to be selected at runtime. It defines a family of interchangeable encapsulated algorithms and exposes a common interface to the client. Which algorithm is chosen could depend on various factors that are determined while the application runs. In Scala, we can simply pass a function as a parameter to a method, and depending on the function, a different action will be performed. The command design pattern This design pattern represents an object that is used to store information about an action that needs to be triggered at a later time. The information includes the following: The method name The owner of the method Parameter values The client then decides which commands need to be executed and when by the invoker. This design pattern can easily be implemented in Scala using the by-name parameters feature of the language. The chain of responsibility design pattern The chain of responsibility is a design pattern where the sender of a request is decoupled from its receiver. This way, it makes it possible for multiple objects to handle the request and to keep logic nicely separated. The receivers form a chain where they pass the request and, if possible, they process it, and if not, they pass it to the next receiver. There are variations where a handler might dispatch the request to multiple other handlers at the same time. This somehow reminds us of function composition, which in Scala can be achieved using the stackable traits design pattern. The interpreter design pattern The interpreter design pattern is based on the ability to characterize a well-known domain with a language with a strict grammar. It defines classes for each grammar rule in order to interpret sentences in the given language. These classes are likely to represent hierarchies as grammar is usually hierarchical as well. Interpreters can be used in different parsers, for example, SQL or other languages. 
The iterator design pattern The iterator design pattern is when an iterator is used to traverse a container and access its elements. It helps to decouple containers from the algorithms performed on them. What an iterator should provide is sequential access to the elements of an aggregate object without exposing the internal representation of the iterated collection. The mediator design pattern This pattern encapsulates the communication between different classes in an application. Instead of interacting directly with each other, objects communicate through the mediator, which reduces the dependencies between them, lowers the coupling, and makes the overall application easier to read and maintain. The memento design pattern This pattern provides the ability to roll back an object to its previous state. It is implemented with three objects—originator, caretaker, and memento. The originator is the object with the internal state; the caretaker will modify the originator, and a memento is an object that contains the state that the originator returns. The originator knows how to handle a memento in order to restore its previous state. The observer design pattern This design pattern allows the creation of publish/subscribe systems. There is a special object called subject that automatically notifies all the observers when there are any changes in the state. This design pattern is popular in various GUI toolkits and generally where event handling is needed. It is also related to reactive programming, which is enabled by libraries such as Akka. We will see an example of this towards the end of this book. The state design pattern This design pattern is similar to the strategy design pattern, and it uses a state object to encapsulate different behavior for the same object. It improves the code's readability and maintainability by avoiding the use of large conditional statements. The template method design pattern This design pattern defines the skeleton of an algorithm in a method and then passes some of the actual steps to the subclasses. It allows developers to alter some of the steps of an algorithm without having to modify its structure. An example of this could be a method in an abstract class that calls other abstract methods, which will be defined in the children. The visitor design pattern The visitor design pattern represents an operation to be performed on the elements of an object structure. It allows developers to define a new operation without changing the original classes. Scala can minimize the verbosity of this pattern compared to the pure object-oriented way of implementing it by passing functions to methods. Choosing a design pattern As we already saw, there are a huge number of design patterns. In many cases, they are suitable to be used in combinations as well. Unfortunately, there is no definite answer regarding how to choose the concept of designing our code. There are many factors that could affect the final decision, and you should ask yourselves the following questions: Is this piece of code going to be fairly static or will it change in the future? Do we have to dynamically decide what algorithms to use? Is our code going to be used by others? Do we have an agreed interface? What libraries are we planning to use, if any? Are there any special performance requirements or limitations? This is by no means an exhaustive list of questions. There is a huge amount of factors that could dictate our decision in how we build our systems. 
It is, however, really important to have a clear specification, and if something seems missing, it should always be checked first. By now, we have a fair idea about what a design pattern is and how it can affect the way we write our code. We've iterated through the most famous Gang of Four design patterns out there, and we have outlined the main differences between them. To know more on how to incorporate functional patterns effectively in real-life applications, read our book Scala Design Patterns - Second Edition. Implementing 5 Common Design Patterns in JavaScript (ES8) An Introduction to Node.js Design Patterns

article-image-understanding-the-role-aiops-plays-in-the-present-day-it-environment
Guest Contributor
17 Dec 2019
7 min read

Understanding the role AIOps plays in the present-day IT environment

In most conversations surrounding cybersecurity these days, the term “digital transformation,” gets frequently thrown in the mix, especially when the discussion revolves around AIOps. If you’ve got the slightest bit of interest in any recent developments in the cybersecurity world, you might have an idea of what AIOps is. However, if you didn’t already know- AIOps refers to a multi-layered, modern technology platform that allows enterprises to maximize IT operations by integrating AI and machine learning to detect and solve cybersecurity issues as they occur. As the name suggests, AIOps makes use of essential AI technology such as machine learning for the overall improvement of an organization’s IT operations. However, today- the role that AIOps plays has shifted dramatically- which leaves a lot of room for confusion to harbor amongst cybersecurity officers since most enterprises prefer to take the more conventional route as far as AI application is concerned. To utilize the most out of AIOps, enterprises need to understand the significance of the changes in the present-day IT environment, and how those changes influence the AI’s applications. To aid readers in understanding the volatility of the relationship between AI’s applications and the IT environment it is applicable in, we’ve put together an article that dives into the differences between conventional monitoring methods and present-day enterprise needs. Moreover, we’ll also be shining a light on the importance of the adoption of AIOps in enterprises as well. How has the IT environment changed in the modern times? Before we can get into every nook and cranny of why the transition from a traditional approach to a more modern approach matters, we’d like to make one thing very clear. Just because a specific approach works for one organization in no way guarantees that it would work for you. Perhaps the greatest advice any business owner could receive is to plan according to the specific requirements of their security and IT infrastructure. The greatest shortcoming of many CISOs and CSOs is that they fail to understand the particular needs of their IT environment and rely on conventional applications of AI to maximize the overall IT experience. Speaking of traditional AIOps applications, since the number of ‘moving’ parts or components involved was significantly less in number- the involvement of AI was far less complex, and therefore much easier to monitor and control. In a more modern setting, however, with the wave of digitalization and the ever-growing reliance that enterprises have on cloud computing systems, the number of components involved has increased, which also makes understanding the web much more difficult. Bearing witness to the ever-evolving and complex nature of today’s IT environment are the results of the research conducted by Dynatrace. The results explicitly state that something as simple as a single web or mobile application transaction can involve a staggering number of 37 different components or technologies on average. Taking this into account, relying on a traditional approach to AI becomes redundant, and ineffective since the conventional approach relies on an extremely limited understanding and fails to make sense of all the information provided by an arsenal of tools and dashboards. Not only is the conventional approach to AIOps impractical within the modern IT context, but it is also extremely outdated. 
Having said that perhaps the only approach that fits in the modern-day IT environment is a software intelligence-centric approach, which allows for fast-paced and robust solutions to present-day IT complexities. How important is AIOps for enterprises today? As we’ve already mentioned above, the present-day IT infrastructure requires a drastic change in the relationship that enterprises have had with AIOps so far. For starters, enterprises and organizations need to realize the importance of the role that AIOps plays. Unfortunately, however, there’s an overarching tendency seen in enterprises that enables them the naivety of labeling investing in AIOps as yet another “IT expense.” On the contrary, AIOps is essential for companies and organizations today, since every company is undergoing digitalization, along with increasing their reliance on modern technology more and more. Some cybersecurity specialists might even argue that each company is slowly turning into a software company, primarily because of the rise in cloud-computing systems. AIOps also works on improving the ‘business’ aspect of an enterprise, since the modern consumer looks for enterprises that offer innovative features, along with their ability to enhance user experience through an impeccable and seamless digital experience. Furthermore, in the competitive economic conditions of today, carrying out business operations in a timely manner is critical to an enterprise’s longevity- which is where the integration of AI can help an organization function smoothly. It should also be pointed out that the employment of AIOps opens up new avenues for businesses to step into since it removes the element of fear present in most business owners. The implementation of AIOps also enables an organization to make quick-paced releases since it takes IT problems out of the equation. These problems usually consist of bugs, regulation, and compliance, along with monitoring the overall IT experience being provided to consumers. How can enterprises ensure the longevity of their reliance on AIOps? When it comes to the integration of any new technology into an organization’s routine functions, there are always questions to be asked regarding the impact of the continued reliance on modern technology. To demonstrate the point we’ve made above, let’s return to a tech we’ve referred to throughout the article- cloud computing. Introduced in the 1960s, cloud computing revolutionized data storage to what it is today. However, after a couple of years and some unfortunate cyberattacks launched on cloud storage networks, cybersecurity specialists have found some dire problems with complete dependency on cloud computing storage. Similarly, many cybersecurity specialists and researchers wonder about the negative impacts that a dependency on AIOps could have in the future. When it comes to ensuring enterprises about the longevity of amalgamating AIOps into an enterprise, we’d like to give our assurance through the following reasons: Unlike cloud computing, developments in AIOps are heavily rooted in real-time data fed to the algorithm by an IT team. When you strip down all the fancy IT jargon, the only thing identity you need to trust is that of your IT personnel. Since AIOps relies on smart auto-remediation capabilities, business owners can see an immediate response geared by the employed algorithms. 
One such way that AIOps deploys auto-remediation strategies is by sending out alerts of any possible issue- the practice of which enables businesses to operate on the “business” side of the spectrum since they’ve got a trustworthy agent to rely on. Conclusion At the end of the article, we can only reinstate what’s been said before, in a thousand different ways- it’s high time that enterprises welcome change in the form of AIOps, instead of resisting it. In the modern age of digitalization, the key differences seen in the modern-day IT landscape should be reason enough for enterprises to be on the lookout for new alternatives to securing their data, and by extension- their companies. Author Bio Rebecca James is an enthusiastic cybersecurity journalist. A creative team leader, editor of PrivacyCrypts. What is AIOps and why is it going to be important? 8 ways Artificial Intelligence can improve DevOps Post-production activities for ensuring and enhancing IT reliability [Tutorial]

article-image-top-10-computer-vision-tools
Aaron Lazar
05 Apr 2018
7 min read

Top 10 Tools for Computer Vision

The adoption of Computer Vision has been steadily picking up pace over the past decade, but there’s been a spike in adoption of various computer vision tools in recent times, thanks to its implementation in fields like IoT, manufacturing, healthcare, security, etc. Computer vision tools have evolved over the years, so much so that computer vision is now also being offered as a service. Moreover, the advancements in hardware like GPUs, as well as machine learning tools and frameworks make computer vision much more powerful in the present day. Major cloud service providers like Google, Microsoft and AWS have all joined the race towards being the developers’ choice. But which tool should you choose? Today I’ll take you through a list of the top tools and will help you understand which one to pick up, based on your need. Computer Vision Tools/Libraries OpenCV: Any post on computer vision is incomplete without the mention of OpenCV. OpenCV is a great performing computer vision tool and it works well with C++ as well as Python. OpenCV is prebuilt with all the necessary techniques and algorithms to perform several image and video processing tasks. It’s quite easy to use and this makes it clearly the most popular computer vision library on the planet! It is multi-platform, allowing you to build applications for Linux, Windows and Android. At the same time, it does have some drawbacks. It gets a bit slow when working through massive data sets or very large images. Moreover, on its own, it doesn’t have GPU support and relies on CUDA for GPU processing. Matlab: Matlab is a great tool for creating image processing applications and is widely used in research. The reason being that Matlab allows quick prototyping. Another interesting aspect is that Matlab code is quite concise, as compared to C++, making it easier to read and debug. It tackles errors before execution by proposing some ways to make the code faster. On the downside, Matlab is a paid tool. Also, it can get quite slow during execution time, if that’s something that concerns you much. Matlab is not your go to tool in an actual production environment, as it was basically built for prototyping and research. AForge.NET/Accord.NET: You’ll be excited to know that image processing is possible even if you’re a C# and .NET developer, thanks to AForge/Accord. It’s a great tool that has a lot of filters and is great for image manipulation and different transforms. The Image Processing Lab allows for filtering capabilities like edge detection and more. AForge is extremely simple to use as all you need to do is adjust parameters from a user interface. Moreover, its processing speeds are quite good. However, AForge doesn’t possess the power and capabilities of other tools like OpenCV, like advanced motion picture analysis or even advanced processing on images. TensorFlow: TensorFlow has been gaining popularity over the past couple of years, owing to its power and ease of use. It lets you bring the power of Deep Learning to computer vision and has some great tools to perform image processing/classification - it’s API-like graph tensor. Moreover, you can make use of the Python API to perform face and expression detection. You can also perform classification using techniques like regression. Tensorflow also allows you to perform computer vision of tremendous magnitudes. One of the main drawbacks of Tensorflow is that it’s extremely resource hungry and can devour a GPU’s capabilities in no time, quite uncalled for. 
Moreover, if you wanted to learn how to perform image processing with TensorFlow, you’d have to understand what Machine and Deep Learning is, write your own algorithms and then go forward from there. CUDA: CUDA is a platform for parallel computing, invented by NVIDIA. It enables great boosts in computing performance by leveraging the power of GPUs. The CUDA Toolkit includes the NVIDIA Performance Primitives library which is a collection of signal, image, and video processing functions. If you have large images to process, that are GPU intensive, you can choose to use CUDA. CUDA is easy to program and is quite efficient and fast. On the downside, it is extremely high on power consumption and you will find yourself reformulating for memory distribution in parallel tasks. SimpleCV: SimpleCV is a framework for building computer vision applications. It gives you access to a multitude of computer vision tools on the likes of OpenCV, pygame, etc. If you don’t want to get into the depths of image processing and just want to get your work done, this is the tool to get your hands on. If you want to do some quick prototyping, SimpleCV will serve you best. Although, if your intention is to use it in heavy production environments, you cannot expect it to perform on the level of OpenCV. Moreover, the community forum is not very active and you might find yourself running into walls, especially with the installation. GPUImage: GPUImage is a framework or rather, an iOS library that allows you to apply GPU-accelerated effects and filters to images, live motion video, and movies. It is built on OpenGL ES 2.0. Running custom filters on a GPU calls for a lot of code to set up and maintain. GPUImage cuts down on all of that boilerplate and gets the job done for you. Computer Vision as a Service: Google Cloud and Mobile Vision APIs: Google Cloud Vision API enables developers to perform image processing by encapsulating powerful machine learning models in a simple REST API that can be called in an application. Also, its Optical Character Recognition (OCR) functionality enables you to detect text in your images. The Mobile Vision API lets you detect objects in photos and video, using real-time on-device vision technology. It also lets you scan and recognise barcodes and text. Amazon Rekognition: Amazon Rekognition is a deep learning-based image and video analysis service that makes adding image and video analysis to your applications, a piece of cake. The service can identify objects, text, people, scenes and activities, and it can also detect inappropriate content, apart from providing highly accurate facial analysis and facial recognition for sentiment analysis. Microsoft Azure Computer Vision API: Microsoft’s API is quite similar to its peers and allows you to analyse images, read text in them, and analyse video in near-real time. You can also flag adult content, generate thumbnails of images and recognise handwriting. Bonus: SciPy and NumPy: I thought I’d add these in as well, since I’ve seen quite a few developers use Python to build computer vision applications (without OpenCV, that is). SciPy and NumPy are quite powerful enough to perform image processing. scikit-image is a Python package that is dedicated towards image processing, which uses native NumPy and SciPy arrays as image objects. Moreover, you get to use the cool IPython interactive computing environment and you can also choose to include OpenCV if you want to do some more hardcore image processing. 
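To give a feel for how little code a basic pipeline needs with these libraries, here is a minimal sketch that reads an image with OpenCV, converts it to grayscale, and runs Canny edge detection. The file name and threshold values are placeholders you would adjust for your own images.

import cv2

# Placeholder input file; any photo on disk will do.
image = cv2.imread("street.jpg")
if image is None:
    raise FileNotFoundError("Could not read street.jpg")

# Classic preprocessing: grayscale, then a light blur to suppress noise.
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)

# Canny edge detection; the two thresholds control edge sensitivity.
edges = cv2.Canny(blurred, 50, 150)

cv2.imwrite("street_edges.png", edges)
print("Edge map saved with", int((edges > 0).sum()), "edge pixels")

The equivalent pipeline in scikit-image or one of the cloud APIs above is similarly short, which is exactly why these tools have become the usual entry points into computer vision.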
Well there you have it, these were the top tools for computer vision and image processing. Head on over and check out these resources, to get working with some of the top tools used in the industry.
article-image-4-tips-for-learning-data-visualization-with-python
Sugandha Lahoti
01 Nov 2018
4 min read

4 tips for learning Data Visualization with Python

Data today is the world's most important resource. However, without properly visualizing your data to discover meaningful insights, it's useless. Creating visualizations helps in getting a clearer, more concise view of the data, making it more tangible for (non-technical) audiences. Python is the programming language of choice for developers these days, but sometimes developers face issues when performing data visualization with it. In this post, Tim Großmann and Mario Döbler, the authors of the Data Visualization with Python course, discuss some of the best practices you should keep in mind while visualizing data with Python.

#1 Start looking at and experimenting with examples

One of the most important ways to deeply understand and learn to use Python for data visualization is to download example projects and play around with them. You should read their documentation and comments and change values, observing what influence this has. In many cases, they can even serve as a starting point into which to insert your own data. Think about how you could modify the given examples to visualize your own data.

#2 Start from scratch and build on it

Sometimes starting with an empty canvas is the best approach. Start with only the necessary components, like your data and the import of your library of choice. This builds a nice flow and process that will enable you to debug problems with precision. Once you have gone through the whole process of building a simple visualization, you will have a good understanding of where an error might occur and how to fix it. Starting from scratch sometimes shows you that simpler solutions will save you a lot of time while still communicating the essence of your idea. (A minimal example of such a starting point appears after tip #4 below.)

#3 Make full use of documentation

There are libraries with plenty of documentation to answer every single question you have. Make sure to make the best use of it: research the API, look at the given examples, and search for open issues on the GitHub pages when encountering a problem. The libraries covered in the course "Data Visualization with Python" not only have extensive documentation, but also an active community that is constantly creating new questions on Stack Overflow, which will help you find solutions to your problems in no time.

#4 Use every opportunity you have with data to visualize it

Every time you encounter new data, take a few minutes to think about what information might be interesting, and visualize it. Think back to the last time you had to give a presentation about your findings and all you had was a table with numerical values in it. For you it was understandable, but your colleagues sat there and scratched their heads. Try to create some simple visualizations that would have impressed the entire team with your results. Only practice makes perfect.

We hope that these tips will not only enable you to get better insights into your data but also give you the tools to communicate results better. Don't forget to check out our course Data Visualization with Python to understand, explore, and effectively present data using the powerful data visualization techniques of Python.
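As a companion to tip #2, here is one possible "empty canvas" starting point: a minimal sketch using pandas and matplotlib. The file name and column names are placeholders for your own data.

import pandas as pd
import matplotlib.pyplot as plt

# Load your data; "sales.csv" and its columns are placeholders.
df = pd.read_csv("sales.csv")

# One figure, one axis, one plot call: the smallest useful starting point.
fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(df["month"], df["revenue"], marker="o")
ax.set_xlabel("Month")
ax.set_ylabel("Revenue")
ax.set_title("Monthly revenue")

plt.tight_layout()
plt.show()

From here you can layer on styling, additional series, or a library such as Seaborn once the basic plot does what you expect.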
About the authors

Tim Großmann is a CS student with an interest in diverse topics ranging from AI to IoT. He previously worked at the Bosch Center for Artificial Intelligence in Silicon Valley in the field of big data engineering. He's highly involved in different open source projects and actively speaks at meetups and conferences about his projects and experiences.

Mario Döbler is a graduate student with a focus on deep learning and AI. He previously worked at the Bosch Center for Artificial Intelligence in Silicon Valley in the field of deep learning. Currently, he dedicates himself to applying deep learning to medical data to make health care accessible to everyone.

8 ways to improve your data visualizations
Seaborn v0.9.0 brings better data visualization with new relational plots, theme updates, and more
Getting started with Data Visualization in Tableau

article-image-what-you-need-to-know-about-generative-adversarial-networks
Guest Contributor
19 Jan 2018
7 min read

What you need to know about Generative Adversarial Networks

[box type="note" align="" class="" width=""]We have come to you with another guest post by Indra den Bakker, an experienced deep learning engineer and a mentor on Udacity for many budding data scientists. Indra has also written one of our best selling titles, Python Deep Learning Cookbook which covers solutions to various problems in modeling deep neural networks.[/box] In 2014, we took a significant step in AI with the introduction of Generative Adversarial Networks -better known as GANs- by Ian Goodfellow, amongst others. The real breakthrough of GANs didn’t follow until 2016, however, the original paper includes many novel ideas that would be exploited in the years to come. Previously, deep learning had already revolutionized many industries by achieving above human performance. However, many critics argued that these deep learning models couldn’t compete with human creativity. With the introduction to GANs, Ian showed that these critics could be wrong. Figure 1: example of style transfer with deep learning The idea behind GANs is to create new examples based on a training set - for example to demonstrate the ability to create new paintings or new handwritten digits. In GANs two competing deep learning models are trained simultaneously. These networks compete against each other: one model tries to generate new realistic examples, this network is also called the generator. The other network tries to classify if an example originates from the training set or from the generator, also called as discriminator. In other words, the generator tries to mislead the discriminator by generating new examples. In the figure below we can see the general structure of GANs. Figure 2: GAN structure with X as training examples and Z as noise input. GANs are fundamentally different from other machine learning applications. The task of a GAN is unsupervised: we try to extract patterns and structure from data without additional information. Therefore, we don’t have a truth label. GANs shouldn’t be confused with autoencoder networks. With autoencoders we know what the output should be: the same as the input. But in case of GANs we try to create new examples that look like the training examples but are different. It’s a new way of teaching an agent to learn complex tasks by imitating an “expert”. If the generator is able to fool the discriminator one could argue that the agent mastered the task - think about the Turing test. Best way to explain GANs is to use images as an example. The resulting output of GANs can be fascinating. The most used dataset for GANs is the popular MNIST dataset. This dataset has been used in many deep learning papers, including the original Generative Adversarial Nets paper. Figure 3: example of MNIST training images Let’s say as input we have a bunch of handwritten digits. We want our model to be able to take these examples and create new handwritten digits. We want our model to learn how to write digits in such a way that it looks like handwritten digits. Note, that we don’t care which digits the model creates as long as it looks like one of the digits from 0 to 9. As you may suspect, there is a thin line between generating examples that are exact copies of the training set and newly created images. We need to make sure that the generator generates new images that follow the distribution of the training examples but are slightly different. This is where the creativity needs to come in. In Figure 2, we’ve showed that the generator uses noise -random values- as input. 
This noise is random, to make sure that the generator creates different output each time. Now that we know what we need and what we want to achieve, let's have a closer look at both model architectures, starting with the generator. We will feed the generator with random noise: a vector of 100 values randomly drawn between -1 and 1. Next, we stack multiple fully connected layers with the Leaky ReLU activation function. Our training images are grayscale and sized 28x28, which means that, flattened, we need an output of 784 units for the final layer of our generator - the output of the generator should match the size of the training images. As the activation function for our final layer we will use TanH to make sure the resulting values are squeezed between -1 and 1. The final model architecture of our generator looks as follows:

Figure 4: model architecture of the generator

Next, we define our discriminator model. Most common is to use a mirrored version of the generator, where we have 784 values as input and, as the final layer, a fully connected layer with a single neuron and a sigmoid activation function for binary classification. Keep in mind that both the generator and the discriminator are trained at the same time. The model looks like this:

Figure 5: model architecture of the discriminator
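To make the two architectures described above more concrete, here is a hedged sketch using Keras (one possible framework choice) that follows the same layout: a 100-value noise vector in, stacked fully connected layers with Leaky ReLU, a 784-unit TanH output for the generator, and a mirrored discriminator ending in a single sigmoid unit. The intermediate layer sizes and optimizer settings are illustrative choices, not the exact values used in the article.

from tensorflow.keras import layers, models, optimizers

def build_generator():
    model = models.Sequential(name="generator")
    model.add(layers.Input(shape=(100,)))             # 100-dimensional noise vector
    model.add(layers.Dense(256))
    model.add(layers.LeakyReLU(0.2))
    model.add(layers.Dense(512))
    model.add(layers.LeakyReLU(0.2))
    model.add(layers.Dense(1024))
    model.add(layers.LeakyReLU(0.2))
    model.add(layers.Dense(784, activation="tanh"))   # 28x28 image, flattened, squeezed to [-1, 1]
    return model

def build_discriminator():
    model = models.Sequential(name="discriminator")
    model.add(layers.Input(shape=(784,)))             # mirrored: flattened image in
    model.add(layers.Dense(512))
    model.add(layers.LeakyReLU(0.2))
    model.add(layers.Dense(256))
    model.add(layers.LeakyReLU(0.2))
    model.add(layers.Dense(1, activation="sigmoid"))  # real (1) vs. generated (0)
    model.compile(loss="binary_crossentropy", optimizer=optimizers.Adam(0.0002, 0.5))
    return model

generator = build_generator()
discriminator = build_discriminator()
generator.summary()
discriminator.summary()

During training, the two models are updated alternately: the discriminator on batches of real and generated digits, and the generator through a combined model that tries to push the discriminator's output towards "real".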
In general, generating new images is the harder task. Therefore, it can sometimes be beneficial to train the generator twice for each step, whereas the discriminator is trained only once. Another option is to set the learning rate for the discriminator a bit smaller than the learning rate for the generator. Tracking the performance of GANs can be tricky: sometimes a lower loss doesn't represent a better output. That's why it's a good idea to output the generated images during the training process. In the following figure we can see the digits generated by a GAN after 20 epochs.

Figure 6: example output of generated MNIST images

As we stated in the introduction, GANs didn't get much traction until 2016. They were mostly unstable and hard to train: small adjustments to the model or training parameters resulted in unsatisfying results. Advancements in model architecture and other improvements fixed some of the previous limitations and unlocked the real potential of GANs. An important improvement was introduced by Deep Convolutional GANs (DCGANs). DCGAN is a network architecture wherein both the discriminator and the generator are fully convolutional. The output is more stable for datasets with higher translation invariance, like the Fashion MNIST dataset.

Figure 7: example of Fashion MNIST images generated by a Deep Convolutional Generative Adversarial Network (DCGAN)

There is so much more to discover with GANs and there is huge potential still to be unlocked. According to Yann LeCun, one of the fathers of deep learning, GANs are the most important advancement in machine learning in the last 20 years. GANs can be used for many different applications, ranging from 3D face generation to upscaling the resolution of images and text-to-image. GANs might be the stepping stone we have been waiting for to add creativity to machines.

Author's Bio

Indra den Bakker is an experienced deep learning engineer and mentor on Udacity. He is the founder of 23insights, part of NVIDIA's Inception program, a machine learning start-up building solutions that transform the world's most important industries. For Udacity, he mentors students pursuing a Nanodegree in deep learning and related fields, and he is also responsible for reviewing student projects. Indra has a background in computational intelligence and worked for several years as a data scientist for IPG Mediabrands and Screen6 before founding 23insights.

article-image-why-should-enterprises-use-splunk
Sunith Shetty
25 Jul 2018
4 min read

Why should enterprises use Splunk?

Splunk is a multinational software company that offers its core platform, Splunk Enterprise, as well as many related offerings built on top of it. The platform helps a wide variety of organizational personas, such as analysts, operators, developers, testers, managers, and executives, get analytical insights from machine-created data. It collects and stores this data and provides powerful analytical capabilities, enabling organizations to act on the often valuable insights derived from it.

The Splunk Enterprise platform was built with IT operations in mind. When companies had IT infrastructure problems, troubleshooting and solving them was immensely difficult, complicated, and manual. Splunk was built to collect log files from IT systems and make them searchable and accessible. It is commonly used for information security and development operations, as well as more advanced use cases involving custom machines, the Internet of Things, and mobile devices. Most organizations will start using Splunk in one of three areas: IT operations management, information security, or development operations (DevOps). In today's post, we will look at the thoughts, concepts, and ideas involved in applying Splunk at an organizational level. This article is an excerpt from a book written by J-P Contreras, Erickson Delgado and Betsy Page Sigman titled Splunk 7 Essentials, Third Edition.

IT operations

IT operations have moved from predominantly being a cost center to also being a revenue center. Today, many of the world's oldest companies also make money based on IT services and/or systems. As a result, the delivery of these IT services must be monitored and, ideally, proactively remedied before failures occur. Ensuring that hardware such as servers, storage, and network devices is functioning properly via its log data is important. Organizations can also log and monitor mobile and browser-based software applications for any issues. Ultimately, organizations will want to correlate these sets of data together to get a complete picture of IT health. In this regard, Splunk takes the expertise accumulated over the years and offers a paid-for application known as IT Service Intelligence (ITSI) to give companies a framework for tackling large IT environments. Complicating matters for many traditional organizations is the use of cloud computing technologies, which now produce logs captured from both internally and externally hosted systems.

Cybersecurity

With the relentless focus in today's world on cybersecurity, there is a good chance your organization will need a tool such as Splunk to address a wide variety of information security needs as well. It acts as a log data consolidation and reporting engine, capturing essential security-related log data from devices and software such as vulnerability scanners, phishing prevention, firewalls, and user management and behavior tools, just to name a few. Companies need to ensure they are protected from external as well as internal threats, and for this Splunk offers the paid-for applications Enterprise Security and User Behavior Analytics (UBA). Similar to ITSI, these applications deliver frameworks to help companies meet their specific requirements in these areas.
In addition to cybersecurity to protect the business, companies will often have to comply with, and audit against, specific security standards. These can be industry-related, such as PCI compliance for financial transactions; customer-related, such as National Institute of Standards and Technology (NIST) requirements when working with the US government; or data privacy-related, such as the Health Insurance Portability and Accountability Act (HIPAA) or the European Union's General Data Protection Regulation (GDPR).

Software development and support operations

Commonly referred to as DevOps, this is another area where Splunk's ability to ingest and correlate data from many sources solves many challenges faced in software development, testing, and release cycles. Using Splunk will help teams provide higher-quality software more efficiently. Then, with the controls in the software in place, it will provide visibility into released software, its use, and changes in user behavior, intended or not. This set of use cases is particularly applicable to organizations that develop their own software.

Internet of Things

Many organizations today are looking to build upon the converging trends in computing, mobility, wireless communications, and data to capture data from more and more devices. Examples include data captured from sensors placed on machinery such as wind turbines, trains, and heating and cooling systems. These sensors provide access to the data they capture in standard formats such as JavaScript Object Notation (JSON) through application programming interfaces (APIs).
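As a small illustration of that last point, here is a hedged sketch of pushing a JSON sensor reading into Splunk through the HTTP Event Collector (HEC). The host name, token, and field names are placeholders that you would replace with values from your own Splunk deployment.

import requests

# Placeholder endpoint and token for a Splunk HTTP Event Collector input.
HEC_URL = "https://splunk.example.com:8088/services/collector/event"
HEC_TOKEN = "00000000-0000-0000-0000-000000000000"

# A JSON sensor reading, as it might arrive from a wind turbine controller.
payload = {
    "sourcetype": "_json",
    "source": "turbine-17",
    "event": {
        "temperature_c": 71.4,
        "rpm": 1430,
        "status": "nominal",
    },
}

response = requests.post(
    HEC_URL,
    json=payload,
    headers={"Authorization": f"Splunk {HEC_TOKEN}"},
    timeout=10,
    verify=True,  # point this at a CA bundle if Splunk uses an internal certificate
)
response.raise_for_status()
print(response.json())  # Splunk replies with a small JSON acknowledgement

Once events like this are indexed, the same search and correlation capabilities described above for IT operations and security apply to sensor data as well.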
To summarize, we saw how Splunk can be used at an organizational level for IT operations, cybersecurity, software development and support, and the Internet of Things. To know more about how Splunk can be used to make informed decisions in areas such as IT operations, information security, and the Internet of Things, do check out the book Splunk 7 Essentials, Third Edition.

Create a data model in Splunk to enable interactive reports and dashboards
Splunk leverages AI in its monitoring tools
Splunk Industrial Asset Intelligence (Splunk IAI) targets Industrial IoT marketplace
article-image-mobile-forensics-data-on-the-move
Julian Ursell
31 Oct 2014
5 min read

Data on the Move: The Growing Frontier of Mobile Forensics

"The autopsy report details that the victim was wearing a Google Glass at the time of death." "So it looks like we're through the looking glass on this one!" "Be respectful detective, a man just died." CSI: Miami-esque exchange aside, the continual advancements made in wearable smart technologies, such as the Google Glass, smart watches, and other peripherals mean the expertise and versatility of professional analysts working in the digital forensics space will face ever greater challenges in the future. The original innovation of smartphones steepened the learning curve for forensic investigators and analysts, who have been required to adapt to the rapid development of mobile systems approaching the computing power and intelligence of desktop computers. Since then, this difficulty has only escalated with the constant iteration of new mobile hardware capabilities and updates to mobile operating systems. The velocity at which mobile technology updates makes it a nightmare for analysts to keep up to speed with system architectures (whether Android, iOS, Windows, or Blackberry) so they have the ability to forensically examine devices in a range of critical, sometimes criminal, investigations. That’s even before considering knock-off phones and those that may have been on the wrong end of a baseball bat. For forensic experts, the art of data extraction is an imperative one to master, as crucial evidence lies in the artefacts stored on devices, and encompasses common system files such as texts, emails, call logs, pictures, videos, web histories, passwords, PINs, and unlock patterns, but also less typical objects stored on third-party applications. Geolocation data, timestamps, and user accounts can all provide key evidence to working out the what, where, when, how, why for an investigation. "Perishable" or anonymous messaging services such as Snapchat and Whisper add another dimension to the discoverability of data that is intended to be temporary or anonymous (although Whisper has come under fire recently for storing confidential data, contrary to the application’s anonymity promise). In cases where app data has been "destroyed" or anonymised, forensic technicians need to extract deleted data through manual decoding and even piece together the evidence, Columbo-style, to unravel the perpetrators and the crime. The sophistication of numerous third-party applications and the types of data they are capable of storing adds a considerable degree of complexity and demands a lot in terms of forensic method and data analysis. Mobile forensics is a developing discipline, and with the rise of smart wearables, there is yet another dimension for analysts to get to grips with in the future. The smartwatch is still in the infancy stage of sophistication and adoption among consumers, but the impending release of the Apple Watch, along with the already available Samsung Gear and Pebble Steel ranges indicate that the market is going to expand in the next few years, and this makes it likely that smartwatches will become another addition in the digital (mobile) forensics space. The interesting kink in smartwatch technology is the paired interface they must share with phones, as the devices must effectively be synced in order to function, so that the watch receives notifications (texts, calls) pushed from the phone. 
Mobile forensics is a developing discipline, and with the rise of smart wearables there is yet another dimension for analysts to get to grips with. The smartwatch is still in its infancy in terms of sophistication and adoption among consumers, but the impending release of the Apple Watch, along with the already available Samsung Gear and Pebble Steel ranges, indicates that the market is going to expand over the next few years, making it likely that smartwatches will become another addition to the digital (mobile) forensics space. The interesting kink in smartwatch technology is the paired interface a watch must share with a phone: the devices must effectively be synced in order to function, so that the watch receives notifications (texts, calls) pushed from the phone.

The event logs stored on both devices when phone and watch interact may prove to be an important forensic artefact should they ever be the subject of an investigation, and while native apps on smartwatches are currently on the limited side (contacts, calendar, media, weather), greater sophistication in the realm of smartwatch apps cannot be far away.

A hugely intriguing layer is added to mobile forensics by Google Glass and its array of functionalities; once it eventually becomes globally available, it will be an important device for analysts to understand how to image and pull apart. The Glass can be used for typical smartphone activities, such as sending messages, making calls, taking pictures, and social media interaction, but it is the ability to provide on-the-fly navigation and translation out in the real world, along with voice-commanded Google search and access to real-time information updates through Google Now, that makes it particularly fascinating from a forensics standpoint.

Even considering the familiarity experts will have with Android systems, the unique properties of the Glass, in its use of voice commands and the search and geospatial information it collects, will potentially provide crucial artefacts in investigations. Examiners will need to know how to pull voice command event logs and parse timeline data, recover deleted visual data, analyse GPS usage and locations, and even determine when in time a Glass was on or off.
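As a rough illustration of that timeline-parsing task, the sketch below reconstructs the periods during which a device appears to have been in use from a log of timestamped power events. Everything here is hypothetical: the glass_events.jsonl file, its JSON-lines layout, and the screen_on/screen_off event names are invented for the example, since no forensic specification for Glass artefacts has been published.

import json
from datetime import datetime

LOG_PATH = "evidence/glass_events.jsonl"  # hypothetical export of timestamped device events

def usage_periods(log_path):
    """Pair screen_on/screen_off events into (start, end) intervals of apparent use."""
    events = []
    with open(log_path) as fh:
        for line in fh:
            entry = json.loads(line)
            events.append((datetime.fromisoformat(entry["timestamp"]), entry["event"]))
    events.sort()
    periods, start = [], None
    for ts, event in events:
        if event == "screen_on" and start is None:
            start = ts
        elif event == "screen_off" and start is not None:
            periods.append((start, ts))
            start = None
    return periods

for start, end in usage_periods(LOG_PATH):
    print(f"In use from {start} to {end}")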
A student in digital forensics has even begun attempting to forensically examine the Glass. At this point in time, Glass wearers are those select few chosen for the Explorer beta program, but once the device becomes fully publicly available we should expect it to become popular enough to make another significant addition to the field of smart device forensics. Apparently, Google Glass carriers are split into two camps: 'Explorers' and 'Glassholes'. Whatever the persuasion, forensic investigators may be required to look through a glass, darkly, sooner than they think.

How to plan a system migration in 10 steps

Hari Vignesh
14 Nov 2017
6 min read
How do I plan a system migration?

A system migration refers to the process of moving an application from one environment to another, such as from an on-premises enterprise server to a cloud-based environment, from one server to another, or from cloud to cloud. You might, for example, migrate to or from applications custom-built on platforms like Microsoft Azure, Google App Engine, Force.com, MySQL, or Amazon Web Services. Software migration is always a challenge, but fortunately many system migrations can be managed, and even automated, by a third-party middleware solution.

Sometimes a system migration might be something smaller in scale. You might want to move installed applications and data from one piece of hardware to another (as you would when you give your team new computers), rather than moving an app's entire development environment. While this is pretty easy in technical terms, making sure it's carefully managed for users is nevertheless important.

Why migration?

Migrations are done to improve efficiency or to bring all applications from a legacy system into a current one. That's why migration is becoming such a pressing issue for many organizations as they seek to undergo 'digital transformation' or optimize their existing setup. Often, organizations will want to virtualize their software. This is ultimately about disassociating it from specific operating systems, instead hosting programs in separate environments that sandbox them at runtime.

Here are some migration scenarios:

Example 1: You want to move a team using Adobe Creative Cloud (CC) from old PCs to new Macs. You need to ensure that once team members are working on Macs with Adobe CC installed, they're still able to use paths to the server to access all creative assets.

Example 2: Your team uses custom software developed on one type of cloud environment, such as Amazon Web Services (AWS), and now your organization is moving en masse to Google Cloud Platform (GCP). You need to map each piece of functionality your app had on AWS to GCP, despite the major differences in how each environment operates.
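As a small, hedged illustration of what that mapping exercise looks like, the sketch below shows the same "upload a file to object storage" operation written twice: once against AWS S3 with boto3, and once against Google Cloud Storage with the google-cloud-storage client. The bucket and file names are placeholders, and credentials are assumed to be configured in the environment; the point is simply that every call your application makes on one platform needs an equivalent, and usually differently shaped, call on the other.

# AWS side: upload a local file to S3 using boto3
import boto3
# GCP side: the equivalent operation using the google-cloud-storage client
from google.cloud import storage

def upload_to_s3(bucket_name, local_path, key):
    s3 = boto3.client("s3")
    s3.upload_file(local_path, bucket_name, key)

def upload_to_gcs(bucket_name, local_path, key):
    client = storage.Client()
    bucket = client.bucket(bucket_name)
    bucket.blob(key).upload_from_filename(local_path)

# Placeholder names: swap in your own buckets and asset paths
upload_to_s3("legacy-assets", "reports/q3.pdf", "reports/q3.pdf")
upload_to_gcs("new-gcp-assets", "reports/q3.pdf", "reports/q3.pdf")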
How to successfully plan a system migration

The hard bit is actually planning and executing a system migration. There's a lot that can go wrong, from both a technical and a people perspective. Here are 10 steps you should follow to remain (relatively) calm and in control when you're making a move.

1. Establish your cross-functional representatives. Because of the many hands required to see a software migration project through, and its long timeframe and far-off ROI, you need a champion in your corner, from every corner of the business. Get one key representative from each business function relevant to the software that's moving, be it production, sales, accounting, IT, or another department. These people will help you retain support for the project while it continues without ROI and comes under budget threats during, say, a lean quarter.

2. Frame the project for stakeholders. Whether it's department heads, the C-suite, or the board of directors, lay out the plan and just how essential it is for growth. Set out what the project entails, what it isn't, and lay out goalposts for each phase. Whenever it comes under review, you'll have this initial framework that you and the stakeholders agreed upon.

3. Build a team of internal experts. Find technical experts within your organization who can assist with each part of the migration, even if you're ultimately using a third-party vendor or software for the migration. Put these people in charge of cleaning (or writing programs to clean) existing data, knowing where everything is stored, and understanding the limitations of the platforms at each end of the migration. Depending on the size of your organization, each member of this team may lead their own small team to handle their portion of the project.

4. Take inventory of assets. There's no way to judge a migration as successful if you're not sure whether you lost any data along the way. In the case of data, some of your internal experts can check what is stored, make backups, and export to lightweight .CSV files or hard copies (in the case of legal and other vital documentation). For software or applications, take inventory of each action and function possible with the software, how it interfaces with its databases, what it is and isn't compatible with, and the unique custom configurations that separate it from off-the-shelf software's documentation.

5. Create a risk assessment report. Determine all relevant risks to the migration, including opportunity costs and compliance issues. This will be vital for getting final approval from stakeholders, and will insulate project runners from being blindsided later. A risk assessment matrix template can help you get started.

6. Determine technical, time, and financial requirements. Work with finance to establish long-term budget needs and rates of approval over the whole project. Work with IT, developers, and engineering to figure out the technical aspects and requirements, which method of migration is appropriate, and who will be forced into downtime at which stages of the project. Compile all of this to figure out realistic timing and checkpoints for the migration.

7. Create a project management system for all parties. With the data you gathered in the previous step, and all the teams you've assembled (technical, cross-functional, and stakeholder teams), create a common project management hub where everyone can see progress, send messages, attach files and findings, and generally gain visibility into the process. It should be intuitive for all users. Set up the project management software with the budget and time expectations agreed for each phase. You can present this information to the stakeholders for final approval prior to project kickoff, and use it to submit regular reports to them as they request.

8. Perform the migration in phases. Using the appropriate methods, perform the migration and document every step. Use the project management tool to keep everyone informed and gather documentation. Along the way, when some employees inevitably leave or are added to the team, you can use this tool to quickly get them up to speed.

9. Run test cases after each phase. After each phase, test whatever you've migrated into the new environment and document the outcomes (a minimal example of such a check is sketched after this list). Regular testing and sandboxing will allow your team to catch problems early and regroup or change direction before data is lost and progress is wasted.

10. Record the results. Once the migration is complete, record the final results and compare them to the goalposts set up and tracked in your project management tool. Combine all documentation and deliver a final report to stakeholders, then begin reaping the rewards of your newer, faster, better software, operating system, cloud environment, or whatever else you migrated.

By following the steps above, you should find your system migration a little more stress-free than it might otherwise be!
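To make step 9 slightly more tangible, here is a minimal sketch of the kind of per-phase check worth automating: comparing row counts for a handful of tables between the source and target databases after a data migration phase. The connection strings and table names are placeholders, and a matching row count is only a first-pass sanity check rather than full reconciliation.

import sqlalchemy as sa

# Placeholder connection strings: point these at your real source and target databases
SOURCE_URL = "postgresql://user:pass@source-host/appdb"
TARGET_URL = "postgresql://user:pass@target-host/appdb"
TABLES = ["customers", "orders", "invoices"]  # tables covered by this migration phase

def row_counts(url, tables):
    """Return a {table: row count} mapping for the given database."""
    engine = sa.create_engine(url)
    with engine.connect() as conn:
        return {t: conn.execute(sa.text(f"SELECT COUNT(*) FROM {t}")).scalar() for t in tables}

source_counts = row_counts(SOURCE_URL, TABLES)
target_counts = row_counts(TARGET_URL, TABLES)
for table in TABLES:
    status = "OK" if source_counts[table] == target_counts[table] else "MISMATCH"
    print(f"{table}: source={source_counts[table]} target={target_counts[table]} [{status}]")

In practice you would extend this with spot checks on actual values and application-level smoke tests, but even a crude count comparison catches whole tables that failed to move.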
Hari Vignesh Jayapalan is a Google Certified Android app developer, IDF Certified UI & UX Professional, street magician, fitness freak, technology enthusiast, and wannabe entrepreneur. He can be found on Twitter @HariofSpades.