OPERATING SYSTEM
REPORT TITLE
DESIGNING A MODEL
FOR IMPROVING CPU SCHEDULING
BY USING MACHINE LEARNING
SUBMITTED BY
MUSKAN RATH
IIIT Bhubaneswar
Problem Statement
This report proposes a model for improving CPU scheduling on a uniprocessor system. The model is
implemented in a low-level language (assembly language), and Linux is used as the platform because it is
an open-source environment and its kernel can be modified.
Several methods exist for predicting the length of CPU bursts, such as exponential averaging, but they
may not give accurate or reliable predictions. In this report, we propose a Machine Learning (ML) based
approach to estimating the length of the CPU bursts of processes. The model uses Bayesian theory as a
classifier that decides which process in the ready queue executes first. The proposed approach selects
the most significant attributes of a process using feature selection techniques and then predicts the
CPU burst for the process in the grid. Applying attribute selection techniques is also expected to
improve performance in terms of space, time and estimation accuracy.
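
For reference, the exponential averaging baseline mentioned above can be sketched as follows. This is a minimal illustration of the textbook recurrence tau_next = alpha * t_n + (1 - alpha) * tau_n, not code from the report. The function names, the initial guess of 10 and the sample burst lengths are assumptions chosen for illustration.

```python
def predict_next_burst(previous_prediction, last_burst, alpha=0.5):
    """Exponential average: tau_next = alpha * t_n + (1 - alpha) * tau_n."""
    return alpha * last_burst + (1.0 - alpha) * previous_prediction


def predict_bursts(observed_bursts, initial_guess=10.0, alpha=0.5):
    """Return the prediction held before each observed burst, plus the next one."""
    predictions = [initial_guess]
    for burst in observed_bursts:
        predictions.append(predict_next_burst(predictions[-1], burst, alpha))
    return predictions


if __name__ == "__main__":
    # Illustrative burst lengths only; they are not measurements from the report.
    print(predict_bursts([6, 4, 6, 4, 13, 13, 13]))
```

With alpha = 0.5 the predictor weights the latest burst and the accumulated history equally, so it lags behind a sudden change in burst length. That lag is one reason such a predictor can be unreliable.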
Material and Methods
1. Bayesian Decision Theory
Selecting the process that will execute first from the ready queue involves two phases. They rest on
comparing the static and dynamic properties of new processes in the queue with the properties in the
dataset of previously executed processes. The data come from a process that is not completely known.
The dataset is divided into two categories of processes: useful processes and not-useful processes. A
new process is categorized as either useful or not useful depending on the result of the comparison of
properties. Furthermore, we can find the probability that the new process in the ready queue is
executed, given that the previous process has been executed, as:
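P(A|B) = P(A∩B) / P(B)
=> P(A∩B) = P(A|B) · P(B)    (i)
Also, P(B|A) = P(B∩A) / P(A)
=> P(B∩A) = P(B|A) · P(A)    (ii)
Since P(A∩B) = P(B∩A), the right-hand sides of (i) and (ii) are equal, so we have
P(A|B) · P(B) = P(B|A) · P(A)
which gives Bayes' theorem: P(A|B) = P(B|A) · P(A) / P(B).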
In the above formula, A is a new process that is yet to be executed and B is a previous process that
has been executed. A is the hypothesis, while B is the evidence or data, so we are finding the
probability of the hypothesis given some evidence. Knowing this probability for a new process lets us
optimise CPU scheduling, but a wrong decision incurs a loss. A decision rule α(.) takes an input x and
outputs a decision α(x). We usually require that α(.) lies in a class of decision rules A, i.e.
α(.) ∈ A (here A denotes a set of rules, not the process A above). A is sometimes called the
hypothesis class. In Bayes Decision Theory there are usually no restrictions placed on A (i.e. all rules
α(.) are allowed). In Machine Learning, A is usually restricted so that the available data are enough
to learn a rule from it. The loss function L(α(x), y) is the cost paid for making decision α(x) when
the true state is y. To put everything together, we have:
likelihood function: p(x|y) x ∈ X, y ∈ Y
prior: p(y)
decision rule: α(x) α(x) ∈ Y
loss function: L(α(x), y) cost of making decision α(x) when true state is y.
The risk function combines the loss function, the decision rule, and the probabilities. More precisely,
the risk of a decision rule α(.) is the expected loss L(., .) with respect to the joint probabilities
P(., .), where P(x, y) = p(x|y) p(y):
$$R(\alpha) = \sum_{x,y} L(\alpha(x), y)\, P(x, y)$$
(Note: if x takes continuous rather than discrete values, $\sum_{x,y}$ is replaced by $\sum_{y} \int dx$.)
According to Bayes Decision Theory, one picks the decision rule α̂ that minimizes the risk:
$$\hat{\alpha} = \arg\min_{\alpha \in A} R(\alpha), \quad \text{i.e. } R(\hat{\alpha}) \le R(\alpha) \;\; \forall \alpha \in A$$
where A is the set of all decision rules. α̂ is the Bayes decision and R(α̂) is the Bayes risk.
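
The pieces above (likelihood, prior, decision rule, loss) can be combined in a short sketch. When no restriction is placed on the class of rules, choosing for each observation x the decision with the lowest posterior expected loss also minimizes the overall risk. The observation values ("short", "long"), the probabilities and the loss values below are invented for illustration and do not come from the report.

```python
# Hypothetical numbers: a single discrete feature observed for a process.
PRIOR = {"useful": 0.6, "not_useful": 0.4}
LIKELIHOOD = {  # LIKELIHOOD[y][x] = p(x | y)
    "useful": {"short": 0.8, "long": 0.2},
    "not_useful": {"short": 0.3, "long": 0.7},
}
LOSS = {  # LOSS[decision][true_state]; a wrong "useful" is assumed costlier
    "useful": {"useful": 0.0, "not_useful": 2.0},
    "not_useful": {"useful": 1.0, "not_useful": 0.0},
}


def posterior(likelihood, prior, x):
    """Return p(y | x) for every class y via Bayes' theorem."""
    joint = {y: likelihood[y][x] * prior[y] for y in prior}
    evidence = sum(joint.values())  # p(x)
    return {y: joint[y] / evidence for y in joint}


def bayes_decision(likelihood, prior, loss, x):
    """Pick the decision with the minimum expected loss given observation x."""
    post = posterior(likelihood, prior, x)
    expected = {d: sum(loss[d][y] * post[y] for y in post) for d in loss}
    best = min(expected, key=expected.get)
    return best, expected


if __name__ == "__main__":
    for observation in ("short", "long"):
        print(observation, bayes_decision(LIKELIHOOD, PRIOR, LOSS, observation))
```

The asymmetric loss table is a design choice of this sketch: it encodes the assumption that scheduling a not-useful process first costs more than delaying a useful one.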
2. Proposed Approach
The purpose of the proposed model is to reduce the chance of selecting an inappropriate process, one
that increases the waiting time of all the other processes waiting for the CPU. Throughput also
decreases when the selected process is the one that takes the most CPU time. Bayesian Decision Theory
(BDT) works on prior knowledge and the distribution of the data from which the appropriate data item
has to be selected to achieve the target. Our model proposes a dataset of 100 execution instances of
five programs: (1) matrix multiplication, (2) quick sort, (3) merge sort, (4) heap sort and (5) a
recursive Fibonacci number generator. The data may be collected by saving the process control blocks of
the executed processes. About 100 instances of the five programs are considered enough, and they are
divided into two categories: useful and not-useful processes.
Training and Testing methodology: We propose two types of tests on the training examples. In both, BDT
is applied as the classifier to the datasets collected in the first phase.
The tests are:
Use Training Set: The classifier is evaluated on how well it predicts the class of the instances it was
trained on.
Cross-Validation: The classifier is evaluated by cross-validation over the processes entered in the
system, which tests its recognition accuracy.
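
The two tests can be sketched as below. The report does not specify the classifier's internals or the process attributes, so this sketch assumes a Gaussian naive Bayes classifier and two made-up features (a burst length and a memory size). The synthetic records are generated only to make the example runnable and are not the report's data.

```python
import math
import random


def fit(rows, labels):
    """Estimate a class prior and per-feature Gaussian parameters for each class."""
    model = {}
    for cls in sorted(set(labels)):
        members = [r for r, l in zip(rows, labels) if l == cls]
        stats = []
        for column in zip(*members):
            mean = sum(column) / len(column)
            var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-9
            stats.append((mean, var))
        model[cls] = (len(members) / len(rows), stats)
    return model


def predict(model, row):
    """Return the class with the highest log-posterior for one row."""
    def log_posterior(cls):
        prior, stats = model[cls]
        score = math.log(prior)
        for value, (mean, var) in zip(row, stats):
            score += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)


def accuracy(model, rows, labels):
    """Fraction of rows whose predicted class matches the label."""
    return sum(predict(model, r) == l for r, l in zip(rows, labels)) / len(rows)


def cross_validate(rows, labels, k=5, seed=0):
    """Mean accuracy over k folds; each fold is held out exactly once."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    scores = []
    for held_out in (order[i::k] for i in range(k)):
        held = set(held_out)
        train = [i for i in order if i not in held]
        model = fit([rows[i] for i in train], [labels[i] for i in train])
        scores.append(accuracy(model, [rows[i] for i in held_out],
                               [labels[i] for i in held_out]))
    return sum(scores) / k


def synthetic_processes(n=100, seed=0):
    """Made-up (burst_ms, memory_kb) records; NOT data from the report."""
    rng = random.Random(seed)
    rows, labels = [], []
    for i in range(n):
        useful = i % 2 == 0
        rows.append((rng.gauss(20 if useful else 60, 10),
                     rng.gauss(300 if useful else 500, 100)))
        labels.append("useful" if useful else "not_useful")
    return rows, labels


if __name__ == "__main__":
    rows, labels = synthetic_processes()
    print("training-set accuracy:", accuracy(fit(rows, labels), rows, labels))
    print("5-fold cross-validation accuracy:", cross_validate(rows, labels, k=5))
```

Accuracy measured on the training set tends to be optimistic because the classifier has already seen those instances. The cross-validation figure is usually the more trustworthy estimate of how the classifier handles new processes.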
BASIC APPROACH
1) The programs are run with different time slices using the scheduler in order to find the best STS,
that is, the time slice that gives the minimum turnaround time (the time a process takes from
submission to completion).
2) The basic static and dynamic properties of each process are fed to the BDT (Bayesian classifier),
which classifies the processes into useful and not-useful categories to help determine which process
should be scheduled first.
3) If a new program arrives, classify it and run the program with the predicted STS.
4) If the new program instance is not in the knowledge base, go to step 1.
5) BDT works as an effective classifier for deciding whether a process is useful to the system from
both the user and the system point of view. Because BDT is based solely on probabilistic and
statistical data, the accuracy of selecting the appropriate process may vary from time to time.
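
Steps 1, 3 and 4 above can be illustrated with a small simulation. This sketch assumes a plain round-robin scheduler, processes that all arrive at time 0, and no context-switch cost. The report instead measures real programs under the Linux scheduler, so the simulator, the function names, the dictionary used as a knowledge base and the burst lengths are all hypothetical stand-ins.

```python
from collections import deque


def mean_turnaround(bursts, time_slice):
    """Simulate round-robin for processes that all arrive at time 0."""
    remaining = list(bursts)
    queue = deque(range(len(bursts)))
    finish = [0] * len(bursts)
    clock = 0
    while queue:
        pid = queue.popleft()
        run = min(time_slice, remaining[pid])
        clock += run
        remaining[pid] -= run
        if remaining[pid] > 0:
            queue.append(pid)      # not done yet: back to the end of the queue
        else:
            finish[pid] = clock    # turnaround equals finish time (arrival is 0)
    return sum(finish) / len(finish)


def best_time_slice(bursts, candidates):
    """Step 1: try each candidate slice and keep the lowest mean turnaround."""
    return min(candidates, key=lambda ts: mean_turnaround(bursts, ts))


def time_slice_for(program, knowledge_base, bursts, candidates):
    """Steps 3 and 4: reuse a stored slice, or search for one and remember it."""
    if program not in knowledge_base:
        knowledge_base[program] = best_time_slice(bursts, candidates)
    return knowledge_base[program]


if __name__ == "__main__":
    knowledge_base = {}
    # Illustrative burst lengths only.
    print(time_slice_for("quick_sort", knowledge_base, [6, 3, 1, 7], range(1, 8)))
```

In this sketch a dictionary lookup by program name stands in for the classification in step 3. In the proposed model the BDT classifier would decide which stored time slice applies to a new program.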
A variety of criteria are used in designing the real-time scheduler. Some of these criteria relate to
the behavior of the system as perceived by the individual user (user oriented), while others concern
the total effectiveness of the system in meeting the needs of all users (system oriented). Some
criteria are quantitative measures of performance, while others are more qualitative in nature. From a
user's point of view, response time is generally the most important characteristic of a system, while
from a system point of view, throughput or processor utilization is important. In this work, BDT serves
both points of view when it classifies a process as useful or not useful. As noted above, because it
rests solely on probabilistic and statistical data, its accuracy in selecting the appropriate process
may vary from time to time.