Supporting Program Comprehension with Source Code Summarization
Sonia Haiduc*, Jairo Aponte**, Andrian Marcus*

ICSE NIER 2010

Developers read source code

• Before performing maintenance on a
  system, developers need to understand
  its source code

• During comprehension, programmers
  search and browse the code
Skimming vs. reading code
• Skimming (Starke’09): quickly reading the names of
  software artifacts
  + Fast
  – Insufficient information
  – Shallow understanding

• Reading in depth
   – Slow
   – Too much information
   + Deeper understanding
Code summaries

• Automatically generated, short, yet accurate
  descriptions of source code entities

• They give more information than just the
  header or the name of an artifact

• Significantly shorter and faster to read than
  the source code they summarize
What should we summarize?
• Code
   –   Packages
   –   Classes
   –   Methods
   –   Method sequences
   –   Etc.

• Other artifacts
   – Bug reports (ICSE 2010 - S. Rastkar, G. Murphy, G. Murray)
   – E-mails
   – Etc.
What should we include in code summaries?

• Semantic information
  – What does the source code do?
  – Identifiers and comments that capture the main concepts


• Structural information
  – How does the code work?
  – Class relationships, callers and callees, members of a
    class, etc.
Description: VFS virtual file system read write mkdir directory path save
+ Internal classes: DirectoryEntry
+ Methods: listDirectory, mkdir, constructPath
+ Fields: WRITE_CAP, READ_CAP, lock
+ Sub-classes: FileVFS, FavoritesVFS
+ Other: ...
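
The structural parts of a summary like the one above (methods, fields, sub-classes) could be collected by a small static-analysis pass. The sketch below is only an illustration, not the authors' tool: it uses Python's standard `ast` module on a hypothetical Python stand-in for the VFS class (the `SOURCE` string and the `structural_summary` helper are assumptions made for this example).

```python
import ast

# Hypothetical Python stand-in for the class shown on the slide.
SOURCE = '''
class VFS:
    READ_CAP = 1
    WRITE_CAP = 2

    def list_directory(self, path):
        return []

    def mkdir(self, path):
        return True


class FileVFS(VFS):
    pass
'''


def structural_summary(source: str, class_name: str) -> dict:
    """Collect method names, class-level fields, and direct subclasses."""
    tree = ast.parse(source)
    summary = {"methods": [], "fields": [], "subclasses": []}
    for node in ast.walk(tree):
        if not isinstance(node, ast.ClassDef):
            continue
        if node.name == class_name:
            # Members declared directly in the class body.
            for item in node.body:
                if isinstance(item, ast.FunctionDef):
                    summary["methods"].append(item.name)
                elif isinstance(item, ast.Assign):
                    summary["fields"] += [
                        t.id for t in item.targets if isinstance(t, ast.Name)
                    ]
        elif any(isinstance(b, ast.Name) and b.id == class_name for b in node.bases):
            # Classes that list class_name as a direct base.
            summary["subclasses"].append(node.name)
    return summary


summary = structural_summary(SOURCE, "VFS")
for part, names in summary.items():
    print(f"{part}: {', '.join(names)}")
```

Only members declared directly in the class body and direct subclasses in the same source string are found; callers, callees, and cross-file relationships would need a fuller analysis.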
How should we generate code summaries?

• Semantic information: automatic text
  summarization
  – Machine Learning
  – Discourse-based approaches
  – Term-based Text Retrieval techniques


• Structural information: static analysis
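
As a rough illustration of the term-based option, the sketch below ranks the terms of one method against a small corpus of methods using TF-IDF. TF-IDF is an assumption chosen for simplicity; the slides do not say which text retrieval technique was used. The `METHODS` corpus, the `STOP_WORDS` list, and both helper functions are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical stop list of language keywords and common type names.
STOP_WORDS = {"public", "void", "boolean", "string", "return", "new"}

# Hypothetical corpus: method name -> source text.
METHODS = {
    "mkdir": "public boolean mkdir(String path) { return new File(path).mkdir(); }",
    "listDirectory": "public String[] listDirectory(String path) { return new File(path).list(); }",
    "save": "public void save(String path, String text) { write(path, text); }",
}


def split_terms(text):
    """Split identifiers on non-letters and camelCase boundaries, lowercased."""
    terms = []
    for word in re.findall(r"[A-Za-z]+", text):
        terms += re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", word)
    return [t.lower() for t in terms]


def top_terms(documents, name, k=5):
    """Return the k highest-scoring TF-IDF terms of documents[name]."""
    tokenized = {
        n: [t for t in split_terms(src) if t not in STOP_WORDS]
        for n, src in documents.items()
    }
    # Document frequency: in how many methods each term occurs.
    df = Counter()
    for terms in tokenized.values():
        df.update(set(terms))
    n_docs = len(tokenized)
    tf = Counter(tokenized[name])
    scores = {t: count * math.log(n_docs / df[t]) for t, count in tf.items()}
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


print(top_terms(METHODS, "mkdir", k=2))
```

A term such as `path`, which occurs in every method of this tiny corpus, gets a score of zero, so the ranking favors terms that distinguish one method from the others.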
How can we evaluate code summaries?

• How good are the automatic summaries
  when compared to manual ones?

• How useful are the automatic code
  summaries for SE tasks?
Preliminary evaluation

• Compared automatic code summaries
  with developer code summaries

• 6 developers, 12 methods in aTunes

• Used only lexical information – 5 most
  relevant terms
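
One simple way to compare an automatic term-based summary with a developer's summary is to measure how many terms they share. The sketch below is an assumption about how such a comparison could be scored; the slides do not state the measure that was used, and the example term lists are made up.

```python
def term_overlap(automatic, manual):
    """Compare two term lists as sets: precision, recall, and Jaccard index."""
    auto, human = set(automatic), set(manual)
    shared = auto & human
    union = auto | human
    return {
        "precision": len(shared) / len(auto) if auto else 0.0,
        "recall": len(shared) / len(human) if human else 0.0,
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Hypothetical data: 5 automatically selected terms vs. a developer's terms.
automatic_terms = ["mkdir", "directory", "path", "file", "create"]
developer_terms = ["create", "directory", "path"]

scores = term_overlap(automatic_terms, developer_terms)
print(scores)
```

Here precision is the share of automatic terms the developer also chose, and recall is the share of the developer's terms that the automatic summary recovered.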
Results
• Automatic source code summaries are good at
  reflecting developers’ summaries

• Text Retrieval techniques work as well on
  source code as on natural language in reflecting
  human summaries

• Developers make use of structural information in
  their code summaries:
  – Method name terms
  – Class name terms
  – Formal parameter type terms
What are we doing now?

• What type and how much structural
  information should be included in code
  summaries?
• How do developers generate summaries?
• Are different summaries needed for
  different tasks?
• How useful are code summaries for
  SE tasks?
• Etc.
In summary…
• Automatic code summaries:
  –   Short yet accurate descriptions of source code
  –   Can reduce the effort of program comprehension
  –   Embed both semantic and structural information
  –   Can be generated for a variety of software entities

• Visit my poster
  (HINT: look for the huge and colorful one)
• www.cs.wayne.edu/~severe and
  www.cs.wayne.edu/~shaiduc
• sonja@wayne.edu