Tech Guides

article-image-8-nosql-databases-compared

17 Jun 2015

5 min read

8 NoSQL Databases Compared

17 Jun 2015

NoSQL, or non-relational databases, are increasingly used in big data and real-time web applications. These databases are non-relational in nature and they provide a mechanism for storage and the retrieval of information that is not tabular. There are many advantages of using NoSQL database: Horizontal Scalability Automatic replication (using multiple nodes) Loosely defined or no schema (Huge advantage, if you ask me!) Sharding and distribution Recently we were discussing the possibility of changing our data storage from HDF5 files to some NoSQL system. HDF5 files are great for the storage and retrieval purposes. But now with huge data coming in we need to scale up, and also the hierarchical schema of HDF5 files is not very well suited for all sorts of data we are using. I am a bioinformatician working on data science applications to genomic data. We have genomic annotation files (GFF format), genotype sequences (FASTA format), phenotype data (tables), and a lot of other data formats. We want to be able to store data in a space and memory efficient way and also the framework should facilitate fast retrieval. I did some research on the NoSQL options and prepared this cheat-sheet. This will be very useful for someone thinking about moving their storage to non-relational databases. Also, data scientists need to be very comfortable with the basic ideas of NoSQL DB's. In the course Introduction to Data Science by Prof. Bill Howe (UWashinton) on Coursera, NoSQL DB's formed a significant part of the lectures. I highly recommend the lectures on these topics and this course in general. This cheat-sheet should also assist aspiring data scientists in their interviews. Some options for NoSQL databases: Membase: This is key-value type database. It is very efficient if you only need to quickly retrieve a value according to a key. It has all of the advantages of memcached when it comes to the low cost of implementation. There is not much emphasis on scalability, but lookups are very fast. It has a JSON format with no predefined schema. The weakness of using it for important data is that it's a pure key-value store, and thus is not queryable on properties. MongoDB: If you need to associate a more complex structure, such as a document to a key, then MongoDB is a good option. With a single query, you are going to retrieve the whole document and it can be a huge win. However using these documents like simple key/value stores would not be as fast and as space-efficient as Membase. Documents are the basic unit. Documents are in JSON format with no predefined schema. It makes integration of data easier and faster. Berkeley DB: It stores records in key-value pairs. Both key and value can be arbitrary byte strings, and can be of variable lengths. You can put native programming language data structures into the database without converting to a foreign record first. Storage and retrieval are very simple, but the application needs to know what the structure of a key and a value is in advance, it can't ask the DB. Simple data access services. No limit to the data types that can be stored. No special support for binary large objects (unlike some others) Berkeley DB v/s MongoDB: Berkeley DB has no partitioning while MongoDB supports sharding. MongoDB has some predefined data types like float, string, integer, double, boolean, date, and so on. Berkeley DB has key-value store and MongoDb has documents. Both are schema free. Berkeley DB has no support for Python, for example, although there are many third parties libraries. Redis: If you need more structures like lists, sets, ordered sets and hashes, then Redis is the best bet. It's very fast and provides useful data-structures. It just works, but don't expect it to handle every use-case. Nevertheless, it is certainly possible to use Redis as your primary data-store. But it is used less for distributed scalability, but optimizes high performance lookups at the cost of no longer supporting relational queries. Cassandra: Each key has values as columns and columns are grouped together into sets called column families. Thus each key identifies a row of a variable number of elements. A column family contains rows and columns. Each row is uniquely identified by a key. And each row has multiple columns. Think of a column family as a table, each key-value pair being a row. Unlike RDBMS, different rows in a column family don't have to share the same set of columns, and a column may be added to one or multiple rows at any time. A hybrid between a key-value and a column-oriented database. Has a partially defined schema. Can handle large amounts of data across many servers (clusters), is fault-tolerant and robust. Examples were originally written by Facebook for the Inbox search, and later replaced by HBase. HBase: It is modeled after Google's Bigtable DB. The deal use for HBase is in the situations when you need improved flexibility, great performance, scaling and have Big Data. The data structure is similar to Cassandra where you have column families. Built on Hadoop (HDFS), and can do MapReduce without any external support. Very efficient for storing sparse data . Big data (2 billion rows) is easy to deal with. Examples scalable email/messaging system with search. HBase V/S Cassandra: Hbase is more suitable for data warehousing and large scale data processing and analysis (indexing the web as in a search engine) and Cassandra is more apt for real time transaction processing and the serving of interactive data. Cassandra is more write-centric and HBase is more read-centric. Cassandra has multi- data center support, which can be very useful. Resources NoSQL explained Why NoSQL Big Table About the Author Janu Verma is a Quantitative Researcher at the Buckler Lab, Cornell University, where he works on problems in bioinformatics and genomics. His background is in mathematics and machine learning and he leverages tools from these areas to answer questions in biology.

0
0
27823

article-image-hot-chips-31-ibm-power10-amds-ai-ambitions-intel-nnp-t-cerebras-largest-chip-with-1-2-trillion-transistors-and-more

Fatema Patrawala

23 Aug 2019

7 min read

Hot Chips 31: IBM Power10, AMD’s AI ambitions, Intel NNP-T, Cerebras largest chip with 1.2 trillion transistors and more

Fatema Patrawala

23 Aug 2019

7 min read

Hot Chips 31, the premiere event for the biggest semiconductor vendors to highlight their latest architectural developments is held in August every year. The event this year was held at the Memorial Auditorium on the Stanford University Campus in California, from August 18-20, 2019. Since its inception it is co-sponsored by IEEE and ACM SIGARCH. Hot Chips is amazing for the level of depth it provides on the latest technology and the upcoming releases in the IoT, firmware and hardware space. This year the list of presentations for Hot Chips was almost overwhelming with a wide range of technical disclosures on the latest chip logic innovations. Almost all the major chip vendors and IP licensees involved in semiconductor logic designs took part: Intel, AMD, NVIDIA, Arm, Xilinx, IBM, were on the list. But companies like Google, Microsoft, Facebook and Amazon also took part. There are notable absences from the likes of Apple, who despite being on the Committee, last presented at the conference in 1994. Day 1 kicked off with tutorials and sponsor demos. On the cloud side, Amazon AWS covered the evolution of hypervisors and the AWS infrastructure. Microsoft described its acceleration strategy with FPGAs and ASICs, with details on Project Brainwave and Project Zipline. Google covered the architecture of Google Cloud with the TPU v3 chip. And a 3-part RISC-V tutorial rounded off by afternoon, so the day was spent well with insights into the latest cloud infrastructure and processor architectures. The detailed talks were presented on Day 2 and Day 3, below are some of the important highlights of the event: IBM’s POWER10 Processor expected by 2021 IBM which creates families of processors to address different segments, with different models for tasks like scale-up, scale-out, and now NVLink deployments. The company is adding new custom models that use new acceleration and memory devices, and that was the focus of this year’s talk at Hot Chips. They also announced about POWER10 which is expected to come with these new enhancements in 2021, they additionally announced, core counts of POWER10 and process technology. IBM also spoke about focusing on developing diverse memory and accelerator solutions to differentiate its product stack with heterogeneous systems. IBM aims to reduce the number of PHYs on its chips, so now it has PCIe Gen 4 PHYs while the rest of the SERDES run with the company's own interfaces. This creates a flexible interface that can support many types of accelerators and protocols, like GPUs, ASICs, CAPI, NVLink, and OpenCAPI. AMD wants to become a significant player in Artificial Intelligence AMD does not have an artificial intelligence–focused chip. However, AMD CEO Lisa Su in a keynote address at Hot Chips 31 stated that the company is working toward becoming a more significant player in artificial intelligence. Lisa stated that the company had adopted a CPU/GPU/interconnect strategy to tap artificial intelligence and HPC opportunity. She said that AMD would use all its technology in the Frontier supercomputer. The company plans to fully optimize its EYPC CPU and Radeon Instinct GPU for supercomputing. It would further enhance the system’s performance with its Infinity Fabric and unlock performance with its ROCM (Radeon Open Compute) software tools. Unlike Intel and NVIDIA, AMD does not have a dedicated artificial intelligence chip or application-specific accelerators. Despite this, Su noted, “We’ll absolutely see AMD be a large player in AI.” AMD is considering whether to build a dedicated AI chip or not. This decision will depend on how artificial intelligence evolves. Lisa explained that companies have been improving their CPU (central processing unit) performance by leveraging various elements. These elements are process technology, die size, TDP (thermal design power), power management, microarchitecture, and compilers. Process technology is the biggest contributor, as it boosts performance by 40%. Increasing die size also boosts performance in the double digits, but it is not cost-effective. While AMD used microarchitecture to boost EPYC Rome server CPU IPC (instructions per cycle) by 15% in single-threaded and 23% in multi-threaded workloads. This IPC improvement is above the industry average IPC improvement of around 5%–8%. Intel’s Nervana NNP-T and Lakefield 3D Foveros hybrid processors Intel revealed fine-grained details about its much-anticipated Spring Crest Deep Learning Accelerators at Hot Chips 31. The Nervana Neural Network Processor for Training (NNP-T) comes with 24 processing cores and a new take on data movement that's powered by 32GB of HBM2 memory. The spacious 27 billion transistors are spread across a 688mm2 die. The NNP-T also incorporates leading-edge technology from Intel-rival TSMC. Intel Lakefield 3D Foveros Hybrid Processors Intel in another presentation talked about Lakefield 3D Foveros hybrid processors that are the first to come to market with Intel's new 3D chip-stacking technology. The current design consists of two dies. The lower die houses all of the typical southbridge features, like I/O connections, and is fabbed on the 22FFL process. The upper die is a 10nm CPU that features one large compute core and four smaller Atom-based 'efficiency' cores, similar to an ARM big.LITTLE processor. Intel calls this a "hybrid x86 architecture," and it could denote a fundamental shift in the company's strategy. Finally, the company stacks DRAM atop the 3D processor in a PoP (package-on-Package) implementation. Cerebras largest chip ever with 1.2 trillion transistors California artificial intelligence startup Cerebras Systems introduced its Cerebras Wafer Scale Engine (WSE), the world’s largest-ever chip built for neural network processing. Sean Lie the Co-Founder and Chief Hardware Architect at Cerebras Lie presented the gigantic chip ever at Hot Chips 31. The 16nm WSE is a 46,225 mm2 silicon chip which is slightly larger than a 9.7-inch iPad. It features 1.2 trillion transistors, 400,000 AI optimized cores, 18 Gigabytes of on-chip memory, 9 petabyte/s memory bandwidth, and 100 petabyte/s fabric bandwidth. It is 56.7 times larger than the largest Nvidia graphics processing unit, which accommodates 21.1 billion transistors on a 815 mm2 silicon base. NVIDIA’s multi-chip solution for deep neural networks accelerator NVIDIA which announced about designing a test multi-chip solution for DNN computations at a VLSI conference last year, the company explained chip technology at Hot Chips 31 this year. It is currently a test chip which involves a multi-chip DL inference. It is designed for CNNs and has a RISC-V chip controller. It has 36 small chips, 8 Vector MACs per PE, and each chip has 12 PEs and each package has 6x6 chips. Few other notable talks at Hot Chips 31 Microsoft unveiled its new product Hololens 2.0 silicone. It has a holographic processor and a custom silicone. The application processor runs the app, and the HPU modifies the rendered image and sends to the display. Facebook presented details on Zion, its next generation in-memory unified training platform. Zion which is designed for Facebook sparse workloads, has a unified BFLOAT 16 format with CPU and accelerators. Huawei spoke about its Da Vinci architecture, a single Ascend 310 which can deliver 16 TeraOPS of 8-bit integer performance, support real-time analytics across 16 channels of HD video, and consume less than 8W of power. Xiling Versal AI engine Xilinx, the manufacturer of FPGAs, announced its new Versal AI engine last year as a way of moving FPGAs into the AI domain. This year at Hot Chips they expanded on its technology and more. Ayar Labs, an optical chip making startup, showcased results of its work with DARPA (U.S. Department of Defense's Defense Advanced Research Projects Agency) and Intel on an FPGA chiplet integration platform. The final talk on Day 3 ended with a presentation by Habana, they discussed about an innovative approach to scaling AI Training systems with its GAUDI AI Processor. AMD competes with Intel by launching EPYC Rome, world’s first 7 nm chip for data centers, luring in Twitter and Google Apple advanced talks with Intel to buy its smartphone modem chip business for $1 billion, reports WSJ Alibaba’s chipmaker launches open source RISC-V based ‘XuanTie 910 processor’ for 5G, AI, IoT and self-driving applications

0
0
27821

article-image-opencv-and-android-making-your-apps-see

Raka Mahesa

07 Jul 2016

6 min read

OpenCV and Android: Making Your Apps See

Raka Mahesa

07 Jul 2016

6 min read

Computer vision might sound like an exotic term, but it's actually a piece of technology that you can easily find in your daily life. You know how Facebook can automatically tag your friends in a photo? That's computer vision. Have you ever tried Google Image Search? That's computer vision too. Even the QR Code reader app in your phone employs some sort of computer vision technology. Fortunately, you don't have to conduct your own researches to implement computer vision, since that technology is easily accessible in the form of SDKs and libraries. OpenCV is one of those libraries, and it's open source too. OpenCV focuses on real-time computer vision, so it feels very natural when the library is extended to Android, a device that usually has a camera built in. However, if you're looking to implement OpenCV in your app, you will find the official documentations for the Android version a bit lagging behind the ever evolving Android development environment. But don't worry; this post will help you with that. Together we're going to add the OpenCV Android library and use some of its basic functions on your app. Requirements Before you get started, let’s make sure you have all the following requirements: Android Studio v1.2 or above Android 4.4 (API 19) SDK or above OpenCV for Android library v3.1 or above An Android device with a camera Importing the OpenCV Library All right, let's get started. Once you have downloaded the OpenCV library, extract it and you will find a folder named "sdk" in it. This "sdk" folder should contain folders called "java" and "native". Remember the location of these 2 folders, since we will get back to them soon enough. So now you need to create a new project with blank activity on Android Studio. Make sure to set the minimum required SDK to API 19, which is the lowest version that's compatible with the library. Next, import the OpenCV library. Open the File > New > Import Module... menu and point it to the "java" folder mentioned earlier, which will automatically copy the Java library to your project folder. Now that you have added the library as a module, you need to link the Android project to the module. Open the File > Project Structure... menu and select app. On the dependencies tab, press the + button, choose Module Dependency, and select the OpenCV module on the list that pops up. Next, you need to make sure that the module will be built with the same setting as your app. Open the build.gradle scripts for both the app and the OpenCV module. Copy the SDK version and tools version values in the app graddle script to the OpenCV graddle script. Once it's done, sync the gradle scripts and rebuild the project. Here are the values of my graddle script, but your script may differ based on the SDK version you used. compileSdkVersion 23 buildToolsVersion "23.0.0 rc2" defaultConfig { minSdkVersion 19 targetSdkVersion 23 } To finish importing OpenCV, you need to add the C++ libraries to the project. Remember the "native" folder mentioned earlier? There should be a folder named "libs" inside it. Copy the "libs" folder to the <project-name>/OpenCVLibrary/src/main folder and rename it to "jniLibs" so that Android Studio will know that the files inside that folder are C++ libraries. Sync the project again, and now OpenCV should have been imported properly to your project. Accessing the Camera Now that you’re done importing the library, it's time for the next step: accessing the device's camera. The OpenCV library has its own camera UI that you can use to easily access the camera data, so let’s use that. To do that, simply replace the layout XML file for your main activity with this one. Then you'll need to ask permission from the user to access the camera. Add the following line to the app manifest. <uses-permission android_name="android.permission.CAMERA"/> And if you're building for Android 6.0 (API 23), you will need to ask for permission inside the app. Add the following line to the onCreate() function of your main activity to ask for permission. requestPermissions(new String[] { Manifest.permission.CAMERA }, 1); There are two things to note about the camera UI from the library. First, by default, it will not show anything unless it's activated in the app by calling the enableView() function. And second, on portrait orientation, the camera will display a rotated view. Fixing this last issue is quite a hassle, so let’s just choose to lock the app to landscape orientation. Using OpenCV Library With the preparation out of the way, let's start actually using the library. Here's the code for the app's main activity if you want to see how the final version works. To use the library, initialize it by calling the OpenCVLoader.initAsync() method on the activity's onResume() method. This way the activity will always check if the OpenCV library has been initialized every time the app is going to use it. //Create callback protected LoaderCallbackInterface mCallback = new BaseLoaderCallback(this) { @Override public void onManagerConnected(int status) { //If not success, call base method if (status != LoaderCallbackInterface.SUCCESS) super.onManagerConnected(status); else { //Enable camera if connected to library if (mCamera != null) mCamera.enableView(); } } }; @Override protected void onResume() { //Super super.onResume(); //Try to init OpenCVLoader.initAsync(OpenCVLoader.OPENCV_VERSION_2_4_10, this, mCallback); } The initialization process will check if your phone already has the full OpenCV library. If it doesn't, it will automatically open the Google Play page for the OpenCV Manager app and ask the user to install it. And if OpenCV has been initialized, it simply activates the camera for further use. If you noticed, the activity implements the CvCameraViewListener2 interface. This interface enables you to access the onCameraFrame() method, which is a function that allows you to read what image the camera is capturing, and to return what image the interface should be showing. Let's try a simple image processing and show it on the screen. @Override public Mat onCameraFrame(CameraBridgeViewBase.CvCameraViewFrame inputFrame) { //Get edge from the image Mat result = new Mat(); Imgproc.Canny(inputFrame.rgba(), result, 70, 100); //Return result return result; } Imgproc.Canny() is an OpenCV function that does Canny Edge Detection, which is a process to detect all edges in a picture. As you can see, it's pretty simple; you simply need to put the image from the camera (inputFrame.rgba()) into the function and it will return another image that shows only the edges. Here's what the app’s display will look like. And that's it! You've implemented a pretty basic feature from the OpenCV library on an Android app. There are still many image processing features that the library has, so check out this exhaustive list of features for more. Good luck! About the author Raka Mahesa is a game developer at Chocoarts who is interested in digital technology in general. Outside of work hours, he likes to work on his own projects, with Corridoom VR being his latest released game. Raka also regularly tweets as @legacy99.

0
0
27794

article-image-6-key-skills-data-scientist-role

Amey Varangaonkar

21 Dec 2017

6 min read

6 Key Areas to focus on while transitioning to a Data Scientist role

Amey Varangaonkar

21 Dec 2017

6 min read

[box type="note" align="" class="" width=""]The following article is an excerpt taken from the book Statistics for Data Science, authored by James D. Miller. The book dives into the different statistical approaches to discover hidden insights and patterns from different kinds of data.[/box] Being a data scientist is undoubtedly a very lucrative career prospect. In fact, it is one of the highest paying jobs in the world right now. That said, transitioning from a data developer role to a data scientist role needs careful planning and a clear understanding of what the role is all about. In this interesting article, the author highlights six key skills must give special attention to, during this transition. Let's start by taking a moment to state what I consider to be a few generally accepted facts about transitioning to a data scientist. We'll reaffirm these beliefs as we continue through this book: Academia: Data scientists are not all from one academic background. They are not all computer science or statistics/mathematics majors. They do not all possess an advanced degree (in fact, you can use statistics and data science with a bachelor's degree or even less). It's not magic-based: Data scientists can use machine learning and other accepted statistical methods to identify insights from data, not magic. They are not all tech or computer geeks: You don't need years of programming experience or expensive statistical software to be effective. You don't need to be experienced to get started. You can start today, right now. (Well, you already did when you bought this book!) Okay, having made the previous declarations, let's also be realistic. As always, there is an entry-point for everything in life, and, to give credit where it is due, the more credentials you can acquire to begin out with, the better off you will most likely be. Nonetheless, (as we'll see later in this chapter), there is absolutely no valid reason why you cannot begin understanding, using, and being productive with data science and statistics immediately. [box type="info" align="" class="" width=""]As with any profession, certifications and degrees carry the weight that may open the doors, while experience, as always, might be considered the best teacher. There are, however, no fake data scientists but only those with currently more desire than practical experience.[/box] If you are seriously interested in not only understanding statistics and data science but eventually working as a full-time data scientist, you should consider the following common themes (you're likely to find in job postings for data scientists) as areas to focus on: Education Common fields of study here are Mathematics and Statistics, followed by Computer Science and Engineering (also Economics and Operations research). Once more, there is no strict requirement to have an advanced or even related degree. In addition, typically, the idea of a degree or an equivalent experience will also apply here. Technology You will hear SAS and R (actually, you will hear quite a lot about R) as well as Python, Hadoop, and SQL mentioned as key or preferable for a data scientist to be comfortable with, but tools and technologies change all the time so, as mentioned several times throughout this chapter, data developers can begin to be productive as soon as they understand the objectives of data science and various statistical mythologies without having to learn a new tool or language. [box type="info" align="" class="" width=""]Basic business skills such as Omniture, Google Analytics, SPSS, Excel, or any other Microsoft Office tool are assumed pretty much everywhere and don't really count as an advantage, but experience with programming languages (such as Java, PERL, or C++) or databases (such as MySQL, NoSQL, Oracle, and so on.) does help![/box] Data The ability to understand data and deal with the challenges specific to the various types of data, such as unstructured, machine-generated, and big data (including organizing and structuring large datasets). [box type="info" align="" class="" width=""]Unstructured data is a key area of interest in statistics and for a data scientist. It is usually described as data having no redefined model defined for it or is not organized in a predefined manner. Unstructured information is characteristically text-heavy but may also contain dates, numbers, and various other facts as well.[/box] Intellectual Curiosity I love this. This is perhaps well defined as a character trait that comes in handy (if not required) if you want to be a data scientist. This means that you have a continuing need to know more than the basics or want to go beyond the common knowledge about a topic (you don't need a degree on the wall for this!) Business Acumen To be a data developer or a data scientist you need a deep understanding of the industry you're working in, and you also need to know what business problems your organization needs to unravel. In terms of data science, being able to discern which problems are the most important to solve is critical in addition to identifying new ways the business should be leveraging its data. Communication Skills All companies look for individuals who can clearly and fluently translate their findings to a non-technical team, such as the marketing or sales departments. As a data scientist, one must be able to enable the business to make decisions by arming them with quantified insights in addition to understanding the needs of their non-technical colleagues to add value and be successful. This article paints a much clearer picture on why soft skills play an important role in becoming a better data scientist. So why should you, a data developer, endeavor to think like (or more like) a data scientist? Specifically, what might be the advantages of thinking like a data scientist? The following are just a few notions supporting the effort: Developing a better approach to understanding data Using statistical thinking during the process of program or database designing Adding to your personal toolbox Increased marketability If you found this article to be useful, make sure you check out the book Statistics for Data Science, which includes a comprehensive list of tips and tricks to becoming a successful data scientist by mastering the basic and not-so-basic concepts of statistics.

0
0
27786

article-image-webassembly-trick-or-treat

Prasad Ramesh

31 Oct 2018

1 min read

WebAssembly - Trick or Treat?

Prasad Ramesh

31 Oct 2018

1 min read

WebAssembly is a low level language that works in binary and close with the machine code. It defines an AST in a binary format. In this language, you can create and debug code in plain text format. It made popular appearance in many browsers last year and is catching on due to its ability to run heavier apps with speed on a browser window. There are Tools and languages built for it. Why are developers excited about WebAssembly? Developers are excited about this as it can potentially run heavy desktop games and applications right inside your browser window. As Mozilla shares plans to bring more functionality to WebAssembly, modern day web browsing will become more robust. However, the language used by this, WASM, poses some security threats. This is because WASM binary applications cannot be checked for tampers. Some features are even being held back from WebAssembly till it is more secure against attacks like Spectre and Meltdown.

0
0
27769

article-image-understanding-security-features-in-the-google-cloud-platform-gcp

Vincy Davis

27 Jul 2019

10 min read

Understanding security features in the Google Cloud Platform (GCP)

Vincy Davis

27 Jul 2019

10 min read

Google's long experience and success in, protecting itself against cyberattacks plays to our advantage as customers of the Google Cloud Platform (GCP). From years of warding off security threats, Google is well aware of the security implications of the cloud model. Thus, they provide a well-secured structure for their operational activities, data centers, customer data, organizational structure, hiring process, and user support. Google uses a global scale infrastructure to provide security to build commercial services, such as Gmail, Google search, Google Photos, and enterprise services, such as GCP and gsuite. This article is an excerpt taken from the book, "Google Cloud Platform for Architects.", written by Vitthal Srinivasan, Janani Ravi and Et al. In this book, you will learn about Google Cloud Platform (GCP) and how to manage robust, highly available, and dynamic solutions to drive business objective. This article gives an insight into the security features in Google Cloud Platform, the tools that GCP provides for users benefit, as well as some best practices and design choices for security. Security features at Google and on the GCP Let's start by discussing what we get directly by virtue of using the GCP. These are security protections that we would not be able to engineer for ourselves. Let's go through some of the many layers of security provided by the GCP. Datacenter physical security: Only a small fraction of Google employees ever get to visit a GCP data center. Those data centers, the zones that we have been talking so much about, probably would seem out of a Bond film to those that did—security lasers, biometric detectors, alarms, cameras, and all of that cloak-and-dagger stuff. Custom hardware and trusted booting: A specific form of security attacks named privileged access attacks are on the rise. These involve malicious code running from the least likely spots that you'd expect, the OS image, hypervisor, or boot loader. There is the only way to really protect against these, which is to design and build every single element in-house. Google has done that, including hardware, a firmware stack, curated OS images, and a hardened hypervisor. Google data centers are populated with thousands of servers connected to a local network. Google selects and validates building components from vendors and designs custom secure server boards and networking devices for server machines. Google has cryptographic signatures on all low-level components, such as BIOS, bootloader, kernel, and base OS, to validate the correct software stack is booting up. Data disposal: The detritus of the persistent disks and other storage devices that we use are also cleaned thoroughly by Google. This data destruction process involves several steps: an authorized individual will wipe the disk clean using a logical wipe. Then, a different authorized individual will inspect the wiped disk. The results of the erasure are stored and logged too. Then, the erased driver is released into inventory for reuse. If the disk was damaged and could not be wiped clean, it is stored securely and not reused, and such devices are periodically destroyed. Each facility where data disposal takes place is audited once a week. Data encryption: By default GCP always encrypts all customer data at rest as well as in motion. This encryption is automatic, and it requires no action on the user's part. Persistent disks, for instance, are already encrypted using AES-256, and the keys themselves are encrypted with master keys. All these key management and rotation is managed by Google. In addition to this default encryption, a couple of other encryption options exist as well, more on those in the following diagram: Secure service deployment: Google's security documentation will often refer to secure service deployment, and it is important to understand that in this context, the term service has a specific meaning in the context of security: a service is the application binary that a developer writes and runs on infrastructure. This secure service deployment is based on three attributes: Identity: Each service running on Google infrastructure has an associated service account identity. A service has to submit cryptographic credentials provided to it to prove its identity while making or receiving remote procedure calls (RPC) to other services. Clients use these identities to make sure that they are connecting to an intended server and the server will use to restrict access to data and methods to specific clients. Integrity: Google uses a cryptographic authentication and authorization technique at an application layer to provide strong access control at the abstraction level for interservice communication. Google has an ingress and egress filtering facility at various points in their network to avoid IP spoofing. With this approach, Google is able to maximize their network's performance and its availability. Isolation: Google has an effective sandbox technique to isolate services running on the same machine. This includes Linux user separation, language and kernel-based sandboxes, and hardware virtualization. Google also secures operation of sensitive services such as cluster orchestration in GKE on exclusively dedicated machines. Secure interservice communication: The term inter-service communication refers to GCP's resources and services talking to each other. For doing so, the owners of the services have individual whitelists of services which can access them. Using them, the owner of the service can also allow some IAM identities to connect with the services managed by them.Apart from that, Google engineers on the backend who would be responsible to manage the smooth and downtime-free running of the services are also provided special identities to access the services (to manage them, not to modify their user-input data). Google encrypts interservice communication by encapsulating application layer protocols in RPS mechanisms to isolate the application layer and to remove any kind of dependency on network security. Using Google Front End: Whenever we want to expose a service using GCP, the TLS certificate management, service registration, and DNS are managed by Google itself. This facility is called the Google Front End (GFE) service. For example, a simple file of Python code can be hosted as an application on App Engine that (application) will have its own IP, DNS name, and so on. In-built DDoS protections: Distributed Denial-of-Service attacks are very well studied, and precautions against such attacks are already built into many GCP services, notably in networking and load balancing. Load balancers can actually be thought of as hardened, bastion hosts that serve as lightning rods to attract attacks, and so are suitably hardened by Google to ensure that they can withstand those attacks. HTTP(S) and SSL proxy load balancers, in particular, can protect your backend instances from several threats, including SYN floods, port exhaustion, and IP fragment floods. Insider risk and intrusion detection: Google constantly monitors activities of all available devices in Google infrastructure for any suspicious activities. To secure employees' accounts, Google has replaced phishable OTP second factors with U2F, compatible security keys. Google also monitors its customer devices that employees use to operate their infrastructure. Google also conducts a periodic check on the status of OS images with security patches on customer devices. Google has a special mechanism to grant access privileges named application-level access management control, which exposes internal applications to only specific users from correctly managed devices and expected network and geographic locations. Google has a very strict and secure way to manage its administrative access privileges. They have a rigorous monitoring process of employee activities and also a predefined limit for administrative accesses for employees. Google-provided tools and options for security As we've just seen, the platform already does a lot for us, but we still could end up leaving ourselves vulnerable to attack if we don't go about designing our cloud infrastructure carefully. To begin with, let's understand a few facilities provided by the platform for our benefit. Data encryption options: We have already discussed Google's default encryption; this encrypts pretty much everything and requires no user action. So, for instance, all persistent disks are encrypted with AES-256 keys that are automatically created, rotated, and themselves encrypted by Google. In addition to default encryption, there are a couple of other encryption options available to users. Customer-managed encryption keys (CMEK) using Cloud KMS: This option involves a user taking control of the keys that are used, but still storing those keys securely on the GCP, using the key management service. The user is now responsible for managing the keys that are for creating, rotating and destroying them. The only GCP service that currently supports CMEK is BigQuery and is in beta stage for Cloud Storage. Customer-supplied encryption keys (CSEK): Here, the user specifies which keys are to be used, but those keys do not ever leave the user's premises. To be precise, the keys are sent to Google as a part of API service calls, but Google only uses these keys in memory and never persists them on the cloud. CSEK is supported by two important GCP services: data in cloud storage buckets as well as by persistent disks on GCE VMs. There is an important caveat here though: if you lose your key after having encrypted some GCP data with it, you are entirely out of luck. There will be no way for Google to recover that data. Cloud security scanner: Cloud security scanner is a GCP, provided security scanner for common vulnerabilities. It has long been available for App Engine applications, but is now also available in alpha for Compute Engine VMs. This handy utility will automatically scan and detect the following four common vulnerabilities: Cross-site scripting (XSS) Flash injection Mixed content (HTTP in HTTPS) The use of outdated/insecure libraries Like most security scanners, it automatically crawls an application, follows links, and tries out as many different types of user input and event handlers as possible. Some security best practices Here is a list of design choices that you could exercise to cope with security threats such as DDoS attacks: Use hardened bastion hosts such as load balancers (particularly HTTP(S) and SSL proxy load balancers). Make good use of the firewall rules in your VPC network. Ensure that incoming traffic from unknown sources, or on unknown ports, or protocols is not allowed through. Use managed services such as Dataflow and Cloud Functions wherever possible; these are serverless and so have smaller attack vectors. If your application lends itself to App Engine it has several security benefits over GCE or GKE, and it can also be used to autoscale up quickly, damping the impact of a DDOS attack. If you are using GCE VMs, consider the use of API rate limits to ensure that the number of requests to a given VM does not increase in an uncontrolled fashion. Use NAT gateways and avoid public IPs wherever possible to ensure network isolation. Use Google CDN as a way to offload incoming requests for static content. In the event of a storm of incoming user requests, the CDN servers will be on the edge of the network, and traffic into the core infrastructure will be reduced. Summary In this article, you learned that the GCP benefits from Google's long experience countering cyber-threats and security attacks targeted at other Google services, such as Google search, YouTube, and Gmail. There are several built-in security features that already protect users of the GCP from several threats that might not even be recognized as existing in an on-premise world. In addition to these in-built protections, all GCP users have various tools at their disposal to scan for security threats and to protect their data. To know more in-depth about the Google Cloud Platform (GCP), head over to the book, Google Cloud Platform for Architects. Ansible 2 for automating networking tasks on Google Cloud Platform [Tutorial] Build Hadoop clusters using Google Cloud Platform [Tutorial] Machine learning APIs for Google Cloud Platform

0
0
27769

article-image-top-languages-for-artificial-intelligence-development

Natasha Mathur

05 Jun 2018

11 min read

Top languages for Artificial Intelligence development

Natasha Mathur

05 Jun 2018

11 min read

0
0
27744

article-image-dark-web-phishing-kits-cheap-plentiful-and-ready-to-trick-you

Guest Contributor

07 Dec 2018

6 min read

Dark Web Phishing Kits: Cheap, plentiful and ready to trick you

Guest Contributor

07 Dec 2018

6 min read

Spam email is a part of daily life on the internet. Even the best junk mail filters will still allow through certain suspicious looking messages. If an illegitimate email tries to persuade you to click a link and enter personal information, then it is classified as a phishing attack. Phishing attackers send out email blasts to large groups of people with the messages designed to look like they come from a reputable company, such as Google, Apple, or a banking or credit card firm. The emails will typically try to warn you about an error with your account and then urge you to click a link and log in with your credentials. Doing so will bring you to an imitation website where the attacker will attempt to steal your password, social security number, or other private data. These days phishing attacks are becoming more widespread. One of the primary reasons is because of easy access to cybercrime kits on the dark web. With the hacker community growing, internet users need to take privacy seriously and remain vigilant against spam and other threats. Read on to learn more about this trend and how to protect yourself. Dark Web Basics The dark web, sometimes referred to as the deep web, operates as a separate environment on the internet. Normal web browsers, like Google Chrome or Mozilla Firefox, connect to the world wide web using the HTTP protocol. The dark web requires a special browser tool known as the TOR browser, which is fully encrypted and anonymous. Image courtesy of Medium.com Sites on the dark web cannot be indexed by search engines, so you'll never stumble on that content through Google. When you connect through the TOR browser, all of your browsing traffic is sent through a global overlay network so that your location and identity cannot be tracked. Even IP addresses are masked on the dark web. Hacker Markets Much of what takes place in this cyber underworld is illegal or unethical in nature, and that includes the marketplaces that exist there. Think of these sites as blackmarket versions of eBay, where anonymous individuals can buy and sell illegal goods and services. Recently, dark web markets have seen a surge in demands for cybercrime tools and utilities. Entire phishing kits are sold to buyers, which include spoofed pages that imitate real companies and full guides on how to launch an email phishing scam. Image courtesy of Medium.com When a spam email is sent out as part of a phishing scam, the messages are typically delivered through dark web servers that make it hard for junk filters to identify. In addition, the "From" address in the emails may look legitimate and use a valid domain like @gmail.com. Phishing kits can be found for as less as two dollars, meaning that inexperienced hackers can launch a cybercrime effort with little funding or training. It’s interesting to note that personal data prices at the Dark Web supermarket range from a single dollar (Social Security card) to thousands (medical records). Cryptocurrency Scandal You should be on the lookout for phishing scandals related to any company or industry, but in particular, banking and financial attacks can be the most dangerous. If a hacker gains access to your credit card numbers or online banking password, then can commit fraud or even steal your identity. The growing popularity of cryptocurrencies like Bitcoin and Ether have revolutionized the financial industry, but as a negative result of the trend, cybercriminals are now targeting these digital money systems. MyEtherWallet website, which allows users to store blockchain currency in a central location, has been victim to a number of phishing scams in recent months. Image courtesy ofMyEtherWallet.com Because cryptocurrencies do not operate with a central bank or financial authority, you may not know what a legitimate email alert for one looks like. Phishing messages for MyEtherWallet will usually claim that there is an issue with your cryptocurrency account, or sometimes even suggest that you have a payment pending that needs to be verified. Clicking on the link in the phishing email will launch your web browser and navigate to a spoofed page that looks like it is part of myetherwallet.com. However, the page is actually hosted on the hacker's network and will feed directly into their illegitimate database. If you enter your private wallet address, which is a unique string of letters and numbers, the hacker can gain access to all of the funds in your account. Preventative Measures Phishing attacks are a type of cybercrime that targets individuals, so it's up to you to be on guard for these messages and react appropriately. The first line of defense against phishing is to be skeptical of all emails that enter your inbox. Dark web hackers are getting better and better at imitating real companies with their spam and spoofing pages, so you need to look closely when examining the content. Always check the full URL of the links in email messages before you click one. If you do get tricked and end up navigating to a spoofed page in your web browser, you still have a chance to protect yourself. All browsers support secure sockets layer (SSL) functionality and will display a lock icon or a green status bar at the top of the window when a website has been confirmed as legitimate. If you navigate to a webpage from an email that does not have a valid SSL certificate, you should close the browser immediately and permanently delete the email message. The Bottom Line Keep this in mind. As prices for phishing kits drop and supply increases, the allure of engaging in this kind of bad behavior will be too much to resist for an increasing number of people. Expect incidents of phishing attempts will increase. The general internet-browsing public should stay on high alert at all times when navigating their email inbox. Think first, then click. Author Bio Gary Stevens is a front-end developer. He’s a full-time blockchain geek and a volunteer working for the Ethereum foundation as well as an active Github contributor. Packt has put together a new cybersecurity bundle for Humble Bundle Malicious code in npm ‘event-stream’ package targets a bitcoin wallet and causes 8 million downloads in two months Why scepticism is important in computer security: Watch James Mickens at USENIX 2018 argue for thinking over blindly shipping code

0
0
27723

article-image-how-to-become-an-exceptional-performance-engineer

Guest Contributor

14 Dec 2019

8 min read

How to become an exceptional Performance Engineer

Guest Contributor

14 Dec 2019

8 min read

0
0
27713

article-image-7-android-predictions-for-2019

Guest Contributor

13 Jan 2019

8 min read

7 Android Predictions for 2019

Guest Contributor

13 Jan 2019

8 min read

Emerging technologies not only change the way users interact with their devices but they also improve the development process. One such tech where most features emerge is Google’s Android. The Android App Development platform is coming up with new features every year at a neck-breaking pace These are some of the safest Android predictions which can be made for Android development in the year 2019. #1 Voice Command and Virtual Assistants Voice command simply dictates the user’s voice into an electronic word processed document which allows users to operate the system by talking to it and also frees up cognitive working space. It also has some potential drawbacks - it requires a large amount of memory to store voice data files and is difficult to operate in crowded places due to noise interference. What does it have in store for 2019? In 2019, voice search is going to create a new user interface that would be a mandate to take into consideration when developing and designing applications in mobile apps. Voice Assistants are gaining much popularity and we can see every big player has the one such as Siri, Google Assistant, Bixby, Alexa, Cortana. This will continue to grow in 2019. Use Case App: Pingpong Board The use case for voice assistants is to create an application similar to the Ping Pong board. Inside the application, there are two screens: the First screen shows available players with the leaderboard and scores and the second screen displays two players who are playing the game along with their game points. #2 Chatbots Chatbots are trending as they support faster customer service at low labor costs by increasing customer satisfaction. However, simple chatbots are often limited to give a response to the customers which could frustrate them whereas complex chatbots cost more, inhibiting their widespread adoption. What Next? As per the technology experts, it is predicted the whole world is going to introduce their company by Chatbots. The customer support service will be provided efficiently and the customer feedbacks will also be responded quicker to get the better results. Chatbots are a takeaway in this digital world as every application or website wants to provide this facility for the improvised customer support. Chatbots can be taken as the small assistants which are integrated into our applications. We can create our one with the help of DialogFlow which is easy to develop without much coding. Nowadays, facebook messenger is used in spite of being a messaging app as many of the chatbots are integrated into it. Use Case: Allstate chatbot The largest P&C insurer in America developed its own 'ABle' chatbot to help their agents learn to sell commercial insurance products. The bot teaches agents through the commercial selling process and can extract documents and also understands which product an agent is working on and where are they in the process. #3 Virtual and Augmented Reality Augmented Reality systems are highly interactive in nature and operate simultaneously with the real-time environment by reducing the thin line between real and virtual world; enhancing the perceptions and interactions with the real world. It is expensive to develop AR based devices for the desired projects, lack privacy and low-performance level are few drawbacks for the AR compliant devices. What next? The hardware for VR is initially driven by the hardcore games and gadget freaks where the mobile hardware is been caught up in some instances excluding the traditional computing platforms. For the real-world uses with Augmented Reality and sensor into the mobile devices like never before, AR and VR are combined to get much better visibility of applications which seems that virtual reality revolution is finally arriving. Use Case App: MarXent + AR AR is helping professionals to visualize their final products during the creative process from interior design to architecture and construction. By using the headsets enabled by architects, engineers, and design professionals can step directly into their buildings and spaces to look at how their designs might look and can even make virtual spot changes. #4 Android App Architecture Google has finally introduced guidelines after many years to develop the best Android apps. Even you are not forced to use Android architecture components but it is considered a good starting point to build stable applications. The argument about the best pattern for Android - MVC, MVP, MVVM or anything else has turned off and we can trust the solutions from Google which are good enough for all majority of apps. What next? The developers always face confusion implementing the multithreading on Android and to solve these problems, tools like Async Task and EventBus are supporting it. Also, we can choose RxJava, Kotlin Coroutines or Android LiveData for multithreading management. This fetches more stability and less confusion in the developer community. Loads of applications are installed on our mobile devices but we hardly use some of them. For this, Progressive WebApps are becoming popular in e-commerce. #5 Hybrid Solutions Big companies like Facebook is leading in utilizing the cross platforms for most of the part, it is a pragmatic approach where the larger the audience the bigger the market share for advertising and subscription revenues. What Next? The hybrid mobile applications come with the unified development that can substantially save a good amount of money and provide fast deployment through offline support and bridges the gap between other two approaches providing all the extra functionality with very little overhead. The hybrid applications can possibly result in the loss of performance and make the developer rely on the framework itself to nicely play with the targeted operating system. So, escaping out of the traditional hardware and software solutions, the developers have approached the market aiming to offer a total solution or cross-platform solutions. #6 Machine Learning Google switched to AI first from mobile first strategy since some time. This is clearly seen in the TensorFlow and MLKit in the Firebase ecosystem which is gaining popularity for creating simple basic models that do not need expertise in data sciences to make your application intelligent. People are getting more aware of the capabilities of machine learning along with its implementation in Matlab or R for mobile development. What next? Machine learning is used in a variety of applications for banking and financial sector, healthcare, retail, publishing and social media etc. Also, used by Google and Facebook to push down relevant advertisements based on past user search. The major challenge is to implement machine learning by implementing different techniques and interpretation of results which is also complex but important not only for image and speech recognition but also for user behavior prediction and analysis. Machine learning will be utilized in the future for Quantum computing to manipulate and classify large numbers of vectors in high-dimensional spaces. We expect to have better-unsupervised algorithms in building smarter applications that will lead to faster and more accurate outcomes. #7 Rooting Android Rooting Android means to get root access or administrative rights for your device. No matter how much you pay for your device the internals of the device is still locked away. With the help of Rooting Android, several advantages are offered of removing the pre-installed OEM applications, ad-blocking for all the apps which is a great benefit to the user. What next? As the rooting android installs the incompatible application on your device it can brick your device and it is advised to always get your apps from reliable sources. It does not come with a warranty and a wrong setting can move the wrong item which can cause huge problems. The risk with the rooted devices is that the system might not get well updated later which can create errors. Still, It also provides more display options and internal storage along with the greater battery life and speed. It will also make full device backups and have access to root files. Conclusion The year 2019 is going to very interesting for Android app development. We will observe a lot of new technologies emerging that will change the face of mobile development for future use. The developers need to stay up to date with the emerging trends and learn quickly to implement them in designing new products. We can definitely see a bright future by more good quality apps with even more engaging user interactions. We also expect to have more stable solutions to develop applications which result in better products. It becomes important to observe closely the new trends and become a quick learner in mastering these skills that will be the most important in the future. Author Bio Rooney Reeves is a content strategist and a technical blogger associated with eTatvaSoft. An old hand writer by day and an avid reader by night, she has a vast experience in writing about new products, software design, and test-driven methodology. Read Next 8 programming languages to learn in 2019 18 people in tech every programmer and software engineer need to follow in 2019 Cloud computing trends in 2019

0
0
27643

article-image-a-cybersecurity-primer-for-mid-sized-businesses

Guest Contributor

26 Jul 2019

7 min read

A cybersecurity primer for mid sized businesses

Guest Contributor

26 Jul 2019

7 min read

0
0
27536

article-image-the-best-backend-tools-in-web-development

Sugandha Lahoti

06 Jun 2018

5 min read

The best backend tools in web development

Sugandha Lahoti

06 Jun 2018

5 min read

If you’re a backend developer, it’s easy to feel overwhelmed by the range of backend development tools available. It goes without saying that you should use what works for you but sometimes it’s not that easy to even work that out. With this in mind, this year’s Skill Up report offers a useful insight into some of the most popular backend tools being used today. Let’s take a look at what tools came out on top. That should help you make decisions about what you’re going to use or maybe even just learn. Read the Skill Up report in full. Sign up to our weekly newsletter and download the PDF for free. Node.js More than 50% respondents said, they prefer Node.js, the popular server-side Javascript coding framework. Node.js is a Javascript runtime that runs on the V8 JavaScript runtime engine. Node.js adds capabilities to Javascript (front-end language) to let it do more than just creating interactive websites. It uses an event-driven, non-blocking I/O model that makes it lightweight and efficient. The latest stable release of Node, Node 10, will be the next candidate in line for the Long Term Support (LTS) in October 2018. Node.js 10.0 comes with plenty of new features like OpenSSL 1.1.0 security toolkit, upgraded npm, N-API, and much more. Get started with learning Node.js with the following books: Learning Node.js Development Learn Node.js by Building 6 Projects RESTful Web API Design with Node.js 10 - Third Edition ASP.NET Core The next popular alternative was ASP. NET Core with over 25% developers approving it as their choice of backend framework. ASP.NET Core is the open-source cross-platform framework for building backends, web apps and services, and IoT apps. According to the skill-up survey, it was also one of the most popular framework used by developers. It provides a cloud-ready, environment-based configuration system. It seamlessly integrates with popular client-side frameworks and libraries, including Angular, React, and Bootstrap. Get started with ASP.NET Core by reading: Learning ASP.NET Core 2.0 Mastering ASP.NET Core 2.0 ASP.NET Core 2 High Performance - Second Edition Express.js Developers and tech pros also like to work with Express JS, and hence it ranked No. 3 on our list. Express JS is the pre-built Node JS framework that can help developers build faster and smarter websites and web apps. Express basically extends Node.js to build complete web apps. It is the perfect framework to learn for developers, who are fluent in Node.js, but want to transition to creating apps from just server-side technologies. Express is lightweight and comes with extra, built-in web application features and the Express API to support the already robust, feature-packed Node.js platform. Express is not just limited to NodeJS. It also works seamlessly with other modules and offers HTTP utilities and middleware for creating APIs. It can help developers master single-page and multiple-page websites, as well as some complex web apps. You can go through Projects in ExpressJS [Video], a complete course to learn professional web development using Express.js. Laravel Next, was Laravel, a prominent member of a new generation of web frameworks. It is one of the most popular PHP frameworks and is also free and and open source. It features: A simple, fast routing engine Powerful dependency injection container Multiple back-ends for session and cache storage Database agnostic schema migrations Robust background job processing Real-time event broadcasting The latest stable release, Laravel 5 is a substantial upgrade with a lot of new toys, at the same time retaining the features that made Laravel wildly successful. It comes with plenty of architectural as well as design-based changes. Start building with Laravel with these videos. Beginning Laravel [Video] Laravel Foundations: Basics to Every App [Video] Java EE The fifth most popular choice of backend tool is the Java EE. The Enterprise Java standard or Java EE is a collection of technologies and APIs for the Java platform designed to support Enterprise. By enterprise, we mean applications classified as large-scale, distributed, transactional and highly-available, designed to support mission-critical business requirements. Applications written to comply with the Java EE specification do not tie developers to a specific vendor; instead, they can be deployed to any Java EE compliant application server. The Java EE server application implements the Java EE platform APIs and provides the standard Java EE services. The latest stable release, Java EE 8 brings with it a load of features, mainly targeting newer architectures such as microservices, modernized security APIs, and cloud deployments. Our best picks for learning Java EE: Java EE 8 Application Development Architecting Modern Java EE Applications Java EE 8 High Performance The other backend tools which were among the top picks by developers included: Spring, a programming and configuration model for building modern Java-based enterprise applications, on any kind of deployment platform. Django, a powerful Python web framework for creating RESTful web services. It reduces the amount of trivial code, which simplifies the creation of web applications and results in faster development. Flask, a framework for building web servers in Python. It is a micro framework, meaning it’s not a full stack web application development framework. It just gives the developers very basics to get a web server running. Firebase, Google’s mobile platform to help developers run mobile backend code without managing servers and develop high-quality apps. Ruby on Rails, one of the oldest, backend technology. A certain percentage of people still prefer using ruby on rails for their backend code. Rails is a flexible and IDE friendly framework with easy functions and manipulations and the support of the powerful ruby language. The entire skill up survey report can be read on the Packt website, which details on what developers think about the changing tech landscape and the parameters that are driving that change. This survey report is launched at the start of the Skill Up campaign, where every eBook and video will be available for $10. Go grab your free content now!

0
0
27511

article-image-a-tale-of-two-tools-tableau-and-power-bi

Natasha Mathur

07 Jun 2018

11 min read

A tale of two tools: Tableau and Power BI

Natasha Mathur

07 Jun 2018

11 min read

Business professionals are on a constant look-out for a powerful yet cost-effective BI tool to ramp up the operational efficiency within organizations. Two tools that are front-runners in the Self-Service Business Intelligence field currently are Tableau and Power BI. Both tools, although quite similar in nature, offer different features. Most experts say that the right tool depends on the size, needs and the budget of an organization, but when compared closely, one of them clearly beats the other in terms of its features. Now, instead of comparing the two based on their pros and cons, we’ll let Tableau and Power BI take over from here to argue their case covering topics like features, usability, to pricing and job opportunities. For those of you who aren’t interested in a good story, there is a summary of the key points at the end of the article comparing the two tools. [box type="shadow" align="" class="" width=""] The clock strikes 2’o'clock for a meeting on a regular Monday afternoon. Tableau, a market leader in Business Intelligence & data analytics and Power BI; another standout performer and Tableau’s opponent in the field of Business Intelligence head off for a meeting with the Vendor. The meeting where the vendor is finally expected to decide to which tool their organization should pick for their BI needs. With Power BI and Tableau joining the Vendor, the conversation starts on a light note with both tools introducing themselves to the Vendor. Tableau: Hi, I am Tableau, I make it easy for companies all around the world to see and understand their data. I provide different visualization tools, drag & drop features, metadata management, data notifications, etc, among other exciting features. Power BI: Hello, I am Power BI, I am a cloud-based analytics and Business Intelligence platform. I provide a full overview of critical data to organizations across the globe. I allow companies to easily share data by connecting the data sources and helping them create reports. I also help create scalable dashboards for visualization. The vendor nods convincingly in agreement while making notes about the two tools. Vendor: May I know what each one of you offers in terms of visualization? Tableau: Sure, I let users create 24 different types of baseline visualizations including heat maps, line charts and scatter plots. Not trying to brag, but you don’t need intense coding knowledge to develop high quality and complex visualizations with me. You can also ask me ‘what if’ questions regarding the data. I also provide unlimited data points for analysis. The vendor seems noticeably pleased with Tableau’s reply. Power BI: I allow users to create visualizations by asking questions in natural language using Cortana. Uploading data sets is quite easy with me. You can select a wide range of visualizations as blueprints. You can then insert data from the sidebar into the visualization. Tableau passes a glittery infectious smirk and throws a question towards Power BI excitedly. Tableau: Wait, what about data points? How many data points can you offer? The Vendor looks at Power BI with a straight face, waiting for a reply. Power BI: For now, I offer 3500 data points for data analysis. Vendor: Umm, okay, but, won’t the 3500 data point limit the effectiveness for the users? Tableau cuts off Power BI as it tries to answer and replies back to the vendor with a distinct sense of rush in its voice. Tableau: It will! Due to the 3500 data point limit, many visuals can't display a large amount of data, so filters are added. As the data gets filtered automatically, it leads to outliers getting missed. Power BI looks visibly irritated after Tableau’s response and looks at the vendor for slight hope, while vendor seems more inclined towards Tableau. Vendor: Okay. Noted. What can you tell me about your compatibility with data sources? Tableau: I support hundreds of data connectors. This includes online analytical processing (OLAP), big data options (such as NoSQL, Hadoop) as well as cloud options. I am capable of automatically determining the relationship between data when added from multiple sources. I also let you modify data links or create them manually based on your company’s preferences. Power BI: I help connect to users’ external sources including SAP HANA, JSON, MySQL, and more. When data is added from multiple sources, I can automatically determine the relationships between them. In fact, I let users connect to Microsoft Azure databases, third-party databases, files and online services like Salesforce and Google Analytics. Vendor: Okay, that’s great! Can you tell me what your customer support is like? Tableau jumps in to answer the question first yet again. Tableau: I offer direct support by phone and email. Customers can also login to the customer portal to submit a support ticket. Subscriptions are provided based on three different categories namely desktop, server and online. Also, there are support resources for different subscription version of the software namely Desktop, Server, and Online. Users are free to access the support resources depending upon the version of the software. I provide getting started guides, best practices as well as how to use the platform’s top features. A user can also access Tableau community forum along with attending training events. The vendor seems highly pleased with Tableau’s answer and continues scribbling in his notebook. Power BI: I offer faster customer support to users with a paid account. However, all users can submit a support ticket. I also provide robust support resources and documentation including learning guides, a user community forum and samples of how my partners use the platform. Though customer support functionality is limited for users with a free Power BI account. Vendor: Okay, got it! Can you tell me about your learning curves? Do you get along well with novice users too or just professionals? Tableau: I am a very powerful tool and data analysts around the world are my largest customer base. I must confess, I am not quite intuitive in nature but given the powerful visualization features that I offer, I see no harm in people getting themselves acquainted with data science a bit before they decide to choose me. In a nutshell, it can be a bit tricky to transform and clean visualizations with me for novices. Tableau looks at the vendor for approval but he is just busy making notes. Power BI: I am the citizen data scientists’ ally. From common stakeholders to data analysts, there are features for almost everyone on board as far as I am concerned. My interface is quite intuitive and depends more on drag and drop features to build visualizations. This makes it easy for the users to play around with the interface a bit. It doesn’t matter whether you’re a novice or pro, there’s space for everyone here. A green monster of jealousy takes over Tableau as it scoffs at Power BI. Tableau: You are only compatible with Windows. I, on the other hand, am compatible with both Windows and Mac OS. And let’s be real it’s tough to do even simple calculations with you, such as creating a percent-of-total variable, without learning the DAX language. As the flood of anger rises in Power BI, Vendor interrupts them. Vendor: May I just ask one last question before I get ready with the results? How heavy are you on my pockets? Power BI: I offer three subscription plans namely desktop, pro, and premium. Desktop is the free version. Pro is for professionals and starts at $9.99 per user per month. You get additional features such as data governance, content packaging, and distribution. I also offer a 60 day trial with Pro. Now, coming to Premium, it is built on a capacity pricing. What that means is that I charge you per node per month. You get even more powerful features such as premium version cost calculator for custom quote ranges. This is based on the number of pro, frequent and occasional users that are active on an account’s premium version. The vendor seems a little dazed as he continues making notes. Tableau: I offer three subscriptions as well, namely Desktop, Server, and Online. Prices are charged per user per month but billed annually. Desktop category comes with two options: Personal edition (starting at $35) and professional edition (starting at $70). The server option offers on-premises or public cloud capabilities, starting at $35 while the Online version is fully hosted and starts at $42. I also offer a free version namely Tableau Public with which users can create visualizations, save them and share them on social media or their blog. There is a 10GB storage limit though. I also offer 14 days free trial for users so that they can get a demo before the purchase. Tableau and Power BI both await anxiously for the Vendor’s reply as he continued scribbling in his notebook while making weird quizzical expressions. Vendor: Thank you so much for attending this meeting. I’ll be right back with the results. I just need to check on a few things. Tableau and power BI watch the vendor leave the room and heavy anticipation fills the room. Tableau: Let’s be real, I will always be the preferred choice for data visualization. Power BI: We shall see that. Don’t forget that I also offer data visualization tools along with predictive modeling and reporting. Tableau: I have a better job market! Power BI: What makes you say that? I think you need to re-check the Gartner’s Magic Quadrant as I am right beside you on that. Power BI looks at Tableau with a hot rush of astonishment as the Vendor enters the room. The vendor smiles at Tableau as he continues the discussion which makes Power BI slightly uneasy. Vendor: Tableau and Power BI, you both offer great features but as you know I can only pick one of you as my choice for the organizations. An air of suspense surrounds the atmosphere. Vendor: Tableau, you are a great data visualization tool with versatile built-in features such as user interface layout, visualization sharing, and intuitive data exploration. Power BI, you offer real-time data access along with some pretty handy drag and drop features. You help create visualizations quickly and provide even the novice users an access to powerful data analytics without any prior knowledge. The tension notched up even more as the Vendor kept talking. Vendor: Tableau! You’re a great data visualization tool but the price point is quite high. This is one of the reasons why I choose Microsoft Power BI. Microsoft Power BI offers data visualization, connects to external data sources, lets you create reports, etc, all at low cost. Hence, Power BI, welcome aboard! A sense of infinite peace and pride emanates from Power BI. The meeting ends with Power BI and Vendor shaking hands as Tableau silently leaves the room. [/box] We took a peek into the Vendor’s notebook and saw this comparison table. Power BI Tableau Visualization capabilities Good Very Good Compatibility with multiple Data sources Good Good Customer Support Quality Good Good Learning Curve Very Good Good System Compatibility Windows Windows & Mac OS Cost Low Very high Job Market Good Good Analytics Very Good Good Both the Business Intelligence tools are in demand by organizations all over the world. Tableau is fast and agile. It provides a comprehensible interface along with visual analytics where users have the ability to ask and answer questions. Its versatility and success stories make it a good choice for organizations willing to invest in a higher budget Business Intelligence software. Power BI, on the other hand, offers almost similar features as Tableau including data visualization, predictive modeling, reporting, data prep, etc, at one of the lowest subscription prices today in the market. Nevertheless, there are upgrades being made to both of the Business Intelligence tools, and we can only wait to see what’s more to come in these technologies. Building a Microsoft Power BI Data Model “Tableau is the most powerful and secure end-to-end analytics platform”: An interview with Joshua Milligan“ Unlocking the secrets of Microsoft Power BI

0
0
27416

article-image-aws-fargate-makes-container-infrastructure-management-a-piece-of-cake

Savia Lobo

17 Apr 2018

3 min read

AWS Fargate makes Container infrastructure management a piece of cake

Savia Lobo

17 Apr 2018

3 min read

Containers such as Docker, FreeBSD Jails, and many more, are a substantial way for developers to develop and deploy their applications. Also, with container orchestration solutions such as Amazon ECS and EKS (Kubernetes), developers can easily manage and scale these containers, thus enabling them to perform other activities quickly. However, in spite of these management solutions at hand, one also has to take an account of the infrastructure maintenance, its availability, capacity and so on which are added tasks. AWS Fargate eases out these tasks and streamlines all deployments for you, resulting in faster completion of deliverables. At the Re:Invent in November 2017, AWS launched Fargate, a technology which enables one to manage containers without having to worry about managing the container infrastructure underneath. AWS Fargate comes to your rescue here. It is an easy way to deploy your containers on AWS. One can start using Fargate on ECS or EKS, try processes and workloads and later migrate workloads to Fargate. It eliminates most of the management such as placement of resources, scheduling, scaling, and so on, which is a requirement for containers. All you have to do is, Build your container image, Specify the CPU and memory requirements, Define your networking and IAM policies, and Launch your container application Some key benefits of AWS Fargate It allows developers to focus on design, development, and deployment of applications. This eliminates the need to manage a cluster of Amazon EC2 instances. One can easily scale applications using Fargate. Once, the application requirements such as CPU, memory, and so on are defined, Fargate manages effective scaling and infrastructure needed to make containers highly-available. One can launch thousands of containers in no time and easily scale them to run most of the mission-critical applications. AWS Fargate is integrated with Amazon ECS and EKS. Fargate launches and manages containers once CPU and memory needed, IAM policies that container needs are defined and uploaded to Amazon ECS. With Fargate, one gets flexible configuration options that matches one’s applications’ needs. Also, one pays on the basis of per-second granularity. Adoption of Container management as a trend is steadily increasing. Kubernetes, at present, is one of the popular and most used containerized application management platforms. However, users and developers are often confused about who the best Kubernetes provider is. Microsoft and Google have their managed Kubernetes services, but AWS Fargate provides an added ease to Amazon’s EKS (Elastic Container Service for Kubernetes) by eliminating the hassle of container infrastructure management. Read more about AWS Fargate on AWS’ official website.

0
0
27406

article-image-deoldify-colorising-and-restoring-bw-images-and-videos-using-a-nogan-approach

Savia Lobo

17 May 2019

5 min read

DeOldify: Colorising and restoring B&W images and videos using a NoGAN approach

Savia Lobo

17 May 2019

5 min read

Wouldn’t it be magical if we could watch old black and white movie footages and images in color? Deep learning, more precisely, GANs can help here. A recent approach by a software researcher Jason Antic tagged as ‘DeOldify’ is a deep learning based project for colorizing and restoring old images and film footages. https://twitter.com/johnbreslin/status/1127690102560448513 https://twitter.com/johnbreslin/status/1129360541955366913 In one of the sessions at the recent Facebook Developer Conference held from April 30 - May 1, 2019, Antic, along with Jeremy Howard, and Uri Manor talked about how by using GANs one can reconstruct images and videos, such as increasing their resolution or adding color to a black and white film. However, they also pointed out that GANs can be slow, and difficult and expensive to train. They demonstrated how to colorize old black & white movies and drastically increase the resolution of microscopy images using new PyTorch-based tools from fast.ai, the Salk Institute, and DeOldify that can be trained in just a few hours on a single GPU. https://twitter.com/citnaj/status/1123748626965114880 DeOldify makes use of a NoGAN training, which combines the benefits of GAN training (wonderful colorization) while eliminating the nasty side effects (like flickering objects in the video). NoGAN training is crucial while getting some images or videos stable and colorful. An example of DeOldify trying to achieve a stable video is as follows: Source: GitHub Antic said, “the video is rendered using isolated image generation without any sort of temporal modeling tacked on. The process performs 30-60 minutes of the GAN portion of "NoGAN" training, using 1% to 3% of Imagenet data once. Then, as with still image colorization, we "DeOldify" individual frames before rebuilding the video.” The three models in DeOldify DeOldify includes three models including video, stable and artistic. Each of the models has its strengths and weaknesses, and their own use cases. The Video model is for video and the other two are for images. Stable https://twitter.com/johnbreslin/status/1126733668347564034 This model achieves the best results with landscapes and portraits and produces fewer zombies (where faces or limbs stay gray rather than being colored in properly). It generally has less unusual miscolorations than artistic, but it's also less colorful in general. This model uses a resnet101 backbone on a UNet with an emphasis on width of layers on the decoder side. This model was trained with 3 critic pretrain/GAN cycle repeats via NoGAN, in addition to the initial generator/critic pretrain/GAN NoGAN training, at 192px. This adds up to a total of 7% of Imagenet data trained once (3 hours of direct GAN training). Artistic https://twitter.com/johnbreslin/status/1129364635730272256 This model achieves the highest quality results in image coloration, with respect to interesting details and vibrance. However, in order to achieve this, one has to adjust the rendering resolution or render_factor. Additionally, the model does not do as well as ‘stable’ in a few key common scenarios- nature scenes and portraits. Artistic model uses a resnet34 backbone on a UNet with an emphasis on depth of layers on the decoder side. This model was trained with 5 critic pretrain/GAN cycle repeats via NoGAN, in addition to the initial generator/critic pretrain/GAN NoGAN training, at 192px. This adds up to a total of 32% of Imagenet data trained once (12.5 hours of direct GAN training). Video https://twitter.com/citnaj/status/1124719757997907968 The Video model is optimized for smooth, consistent and flicker-free video. This would definitely be the least colorful of the three models; while being almost close to the ‘stable’ model. In terms of architecture, this model is the same as "stable"; however, differs in training. It's trained for a mere 2.2% of Imagenet data once at 192px, using only the initial generator/critic pretrain/GAN NoGAN training (1 hour of direct GAN training). DeOldify was achieved by combining certain approaches including: Self-Attention Generative Adversarial Network: Here, Antic has modified the generator, a pre-trained U-Net, to have the spectral normalization and self-attention. Two Time-Scale Update Rule: It’s just one to one generator/critic iterations and higher critic learning rate. This is modified to incorporate a "threshold" critic loss that makes sure that the critic is "caught up" before moving on to generator training. This is particularly useful for the "NoGAN" method. NoGAN doesn’t have a separate research paper. This, in fact, is a new type of GAN training developed to solve some key problems in the previous DeOldify model. NoGAN includes the benefits of GAN training while spending minimal time doing direct GAN training. Antic says, “I'm looking to make old photos and film look reeeeaaally good with GANs, and more importantly, make the project useful.” “I'll be actively updating and improving the code over the foreseeable future. I'll try to make this as user-friendly as possible, but I'm sure there's going to be hiccups along the way”, he further added. To further know about the hardware components and other details head over to Jason Antic’s GitHub page. Training Deep Convolutional GANs to generate Anime Characters [Tutorial] Sherin Thomas explains how to build a pipeline in PyTorch for deep learning workflows Using deep learning methods to detect malware in Android Applications

0
0
27397

8 NoSQL Databases Compared

Hot Chips 31: IBM Power10, AMD’s AI ambitions, Intel NNP-T, Cerebras largest chip with 1.2 trillion transistors and more

OpenCV and Android: Making Your Apps See

6 Key Areas to focus on while transitioning to a Data Scientist role

WebAssembly - Trick or Treat?

Understanding security features in the Google Cloud Platform (GCP)

Top languages for Artificial Intelligence development

Dark Web Phishing Kits: Cheap, plentiful and ready to trick you

How to become an exceptional Performance Engineer

7 Android Predictions for 2019

Trending Topics

A cybersecurity primer for mid sized businesses

The best backend tools in web development

A tale of two tools: Tableau and Power BI

AWS Fargate makes Container infrastructure management a piece of cake

DeOldify: Colorising and restoring B&W images and videos using a NoGAN approach