Constrained text generation to measure reading performance:
A new approach based on constraint programming
Jean-Charles Régin(3)
Joint work with Alexandre Bonlarron(1,3), Aurélie Calabrèse(2), Pierre Kornprobst(1)
(1) Université Côte d’Azur, Inria, France
(2) Aix Marseille Université, CNRS, LPC, Marseille, France
(3) Université Côte d’Azur, I3S, France
Standardized Text (Mansfield et al., 1993)
• Standardized text: sentences that are read at the same speed
• Usability: to assess reading performance
Constrained text generation
This is a problem dominated by rules (constraints).
The MNREAD Chart is one such problem.
MNREAD Rules
• Display Rules, e.g. the sentence must fit inside the chart rectangle
• Lexical Rules, e.g. a vocabulary of 3000 words from CE2 (French 3rd grade) textbooks
• Grammatical Rules, e.g. no punctuation
• Length Rules, e.g. 60 characters, between 9 and 15 words
An example MNREAD sentence is shown on the chart; there are only 38 MNREAD sentences in French.
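To make these rules concrete, here is a minimal rule checker; this is an illustrative sketch, not the official MNREAD specification: the lexicon, the exact character count and the display rule are simplified assumptions.

```python
import string

def satisfies_basic_mnread_rules(sentence: str, lexicon: set) -> bool:
    """Check a few MNREAD-style rules on a candidate sentence (sketch).

    The display rule (the sentence must fit the chart layout) is not
    modeled here, and the 60-character count is taken as exact.
    """
    words = sentence.split()
    # Length rules: 60 characters including spaces, between 9 and 15 words.
    if len(sentence) != 60 or not (9 <= len(words) <= 15):
        return False
    # Grammatical rule: no punctuation (apostrophes are kept since they
    # appear inside French words; an assumption of this sketch).
    forbidden = set(string.punctuation) - {"'"}
    if any(ch in forbidden for ch in sentence):
        return False
    # Lexical rule: every word must belong to the restricted vocabulary.
    return all(w.lower() in lexicon for w in words)
```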
Are there enough sentences?
Questions:
• Are there enough sentences? No: there are only 38 MNREAD sentences in French, while a few thousand are needed to detect and monitor visual pathologies throughout life.
• Is it really difficult to obtain more sentences that respect the rules?
Naive method
Search for MNREAD-like sentences in a corpus.
Filtering a corpus of 2300 books (about 10,000,000 sentences) through the Display, Lexicon, Grammar and Length rules, the candidate set shrinks to 1,000,000, then 10,000, then 6, and finally only 3 sentences survive all the rules.
Problem: this method does not scale up.
Solution: we have to generate them, but how?
How to generate standardized sentences?
LLM-based approach (GPT, BERT) + search: good text quality, but unlikely to find an instance that satisfies the constraints.
These models generate sentences word by word, selecting the next word as the most suitable one (they actually work on tokens rather than words).
Prompt (ChatGPT 3.5): give me a sentence of sixty characters with spaces included
“Elephants march majestically through the savannah at sunset, their presence captivating”
Prompt (ChatGPT 3.5): give me a sentence of sixty characters
“The cat sat on the mat and purred softly”
How to generate standardized sentences?
Ad hoc method: a recent method proposed by the creators of MNREAD, based on hand-defined models (Mansfield et al., 2019).
• Only one "good" sentence out of 8000; memorization bias
• A semi-automatic method designed for the English language
• Non-trivial to extend to Romance languages, e.g. gender agreement in French: mon ami est beau / mon amie est belle ("my friend is handsome / my friend is beautiful")
How to generate standardized sentences?
n-gram based methods (Papadopoulos et al., 2015)
Corpus → n-grams → Generation
When used with a random walk, this produces sentences in the style of an author.
Problems:
• How to integrate constraints?
• How to manage the meaning of the sentences?
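For reference, a random walk over an n-gram table can be sketched as follows (bigrams for brevity; the corpus format and end-of-sentence handling are simplified assumptions):

```python
import random
from collections import defaultdict

def build_bigrams(corpus):
    """Map each word to the list of words that follow it in the corpus.

    `corpus` is a list of tokenized sentences (lists of words).
    """
    successors = defaultdict(list)
    for sentence in corpus:
        for current, nxt in zip(sentence, sentence[1:]):
            successors[current].append(nxt)
    return successors

def random_walk(successors, start, max_len=15):
    """Generate a sentence by repeatedly sampling a successor word."""
    words = [start]
    while len(words) < max_len and successors[words[-1]]:
        words.append(random.choice(successors[words[-1]]))
    return " ".join(words)
```

The walk mimics the corpus style locally, but nothing in it enforces the MNREAD rules or global coherence, which is exactly the problem raised above.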
Multi-Valued Decision Diagram (MDD)
• A generalization of Binary Decision Diagrams (BDDs)
• Each layer represents a variable
• Each path between the root and the terminal node tt is a valid assignment of the variables
• An MDD models all the tuples that satisfy a constraint
• Example: an MDD having 3 solutions: (a,b), (a,a), (b,b)
• Data structure for computing and storing the solutions of a problem in compressed form, using a directed acyclic graph
• Advantage: a powerful modeling tool. With one billion arcs we can represent 10^90 solutions!
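A bare-bones MDD representation and its path enumeration might look like the sketch below (the node and edge encoding is chosen for illustration only; MDDLib's actual data structures differ):

```python
class Node:
    """A node of an MDD; outgoing edges map a label (value) to a child node."""
    def __init__(self, name):
        self.name = name
        self.edges = {}          # label -> child Node

    def add_edge(self, label, child):
        self.edges[label] = child

def enumerate_paths(node, terminal, prefix=()):
    """Yield every label sequence (solution) from `node` down to `terminal`."""
    if node is terminal:
        yield prefix
        return
    for label, child in node.edges.items():
        yield from enumerate_paths(child, terminal, prefix + (label,))

# The 3-solution example from the slide: (a,a), (a,b), (b,b).
root, u, v, tt = Node("root"), Node("u"), Node("v"), Node("tt")
root.add_edge("a", u)    # first variable = a
root.add_edge("b", v)    # first variable = b
u.add_edge("a", tt)      # (a,a)
u.add_edge("b", tt)      # (a,b)
v.add_edge("b", tt)      # (b,b)
print(sorted(enumerate_paths(root, tt)))  # [('a', 'a'), ('a', 'b'), ('b', 'b')]
```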
MDD and compression
• Example: the sum of 3 variables
• Corresponds to an automaton
• The last layer gives the value of the sum
Reduction
• Operation which merges equivalent nodes
• Two nodes are equivalent if they have the same outgoing edges (same destinations and same labels)
• This is the analogue of the minimization of finite automata
[Figure, shown in two steps: an MDD with nodes root, a, b, c, d, e and terminal tt; after reduction, the equivalent nodes c and e are merged into a single node ce.]
Reduction
• Reduction may gain an exponential factor
• Consequence: an MDD can be exponentially smaller than an equivalent automaton
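A bottom-up reduction pass over this layered representation can be sketched as follows (it reuses the illustrative Node class above; the layer list and the signature encoding are assumptions of the sketch):

```python
def reduce_mdd(layers):
    """Merge equivalent nodes, layer by layer from the terminal up to the root.

    Two nodes are equivalent when their outgoing edges carry the same labels
    to the same children; parents are redirected to the surviving node, as in
    the minimization of finite automata.
    """
    representative = {}                      # id(old node) -> surviving node
    for layer in reversed(layers):
        by_signature = {}
        for node in layer:
            # Point the edges at surviving representatives of the layer below.
            node.edges = {lab: representative.get(id(ch), ch)
                          for lab, ch in node.edges.items()}
            sig = tuple(sorted((lab, id(ch)) for lab, ch in node.edges.items()))
            representative[id(node)] = by_signature.setdefault(sig, node)
        layer[:] = list(by_signature.values())   # keep one node per signature

# Usage with the earlier example: reduce_mdd([[root], [u, v], [tt]]).
# Here u and v stay separate because their outgoing edges differ.
```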
Compression gain
• Compression may gain an exponential factor
• It often does!
• Example: an MDD requiring 600,000 edges to represent 10^90 solutions, that is a compression factor of 10^86
• Sometimes the gain can be subtle
Alldiff constraint
• Number of nodes = 2^n
• Number of solutions = n!
• What about the ratio n!/2^n?
• It is exponential
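A quick side calculation (not on the original slide) makes the gain explicit via Stirling's approximation:

```latex
\frac{n!}{2^n} \;\approx\; \frac{\sqrt{2\pi n}\,(n/e)^n}{2^n}
\;=\; \sqrt{2\pi n}\left(\frac{n}{2e}\right)^n
```

so the ratio grows faster than any fixed exponential base as soon as n exceeds 2e ≈ 5.4: the MDD with 2^n nodes stores n! solutions with an exponential compression factor.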
MDD: creation
• An MDD can be created without enumerating the solution set
• It can be built from a Dynamic Programming formulation
• A kind of search compression
• So what? Operations!
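As an illustration of the dynamic-programming construction, the sum-of-variables MDD from the earlier slide can be built layer by layer, keeping one node per reachable partial sum (a sketch reusing the Node class above):

```python
def build_sum_mdd(domains):
    """Build the MDD of all assignments of the variables, without enumerating
    them: the DP state of a layer is the partial sum reached so far."""
    root = Node("root")
    current = {0: root}                       # partial sum -> node of the layer
    for i, domain in enumerate(domains):
        nxt = {}
        for partial, node in current.items():
            for value in domain:
                total = partial + value
                child = nxt.setdefault(total, Node(f"layer{i + 1}_sum{total}"))
                node.add_edge(value, child)
        current = nxt
    return root, current          # the last layer has one node per total sum

# Sum of 3 variables with domain {0, 1}: the last layer has nodes for 0..3.
root3, totals = build_sum_mdd([[0, 1], [0, 1], [0, 1]])
print(sorted(totals))             # [0, 1, 2, 3]
```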
MDD: operations
• Intersection, union, difference, negation, etc.
• Operations are performed without decompression
• Intersecting 2 MDDs is equivalent to taking the conjunction of the 2 constraints represented by the MDDs
• Relation between MDD operations and constraint combination:
  ○ Intersection: conjunction
  ○ Union: disjunction
  ○ Negation: negation
Intersection (worked example)
[Figure, built step by step over several slides: the intersection of a first MDD (nodes root1, a, b, ce, d, terminal tt1) with a second MDD (nodes root2, k, l, terminal tt2), both labelled with values 0, 1, 2. The result is constructed level by level as a product of the two MDDs, creating the nodes root, ak', cel', dl' and the terminal tt; an arc is kept only if its label is allowed by both MDDs at that level.]
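The product construction behind this example can be sketched as follows (written recursively for brevity, whereas the on-the-fly version in the talk proceeds level by level; it reuses the illustrative Node class above):

```python
def intersect(root1, root2):
    """Intersection of two MDDs: an arc is kept only if its label is allowed
    by both MDDs; each result node is a pair of nodes, one from each MDD."""
    pairs = {}                                 # (id, id) -> product node

    def product(n1, n2):
        key = (id(n1), id(n2))
        if key not in pairs:
            node = Node(n1.name + n2.name + "'")
            pairs[key] = node
            for label, child1 in n1.edges.items():
                if label in n2.edges:          # label present in both MDDs
                    node.add_edge(label, product(child1, n2.edges[label]))
        return pairs[key]

    # Dead branches (nodes with no path to the terminal) would still need to
    # be pruned afterwards, as in the slides where some arcs disappear.
    return product(root1, root2)
```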
Intersection
• Be careful: do not assume that the intersection will have fewer nodes/edges
• The resulting MDD can be exponentially larger, because it can be locally decompressed
Operations
• In-place
• On-the-fly (i.e. avoid having to define the MDDs in advance; proceed level by level)
3 Ideas

First idea: store and retrieve n-grams efficiently
Successions Constraint
• All the n-grams of the corpus are inserted in the MDD as solutions
• The MDD is used as a TRIE
• To store and reTRIEve n-grams
• Example query: what is the next word of "The white cat"?
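A simplistic n-gram trie supporting this query could look like the sketch below (plain dictionaries for illustration; the actual successions constraint lives inside the MDD solver):

```python
from collections import defaultdict

class NGramTrie:
    """Store n-grams and retrieve the possible next words of a prefix."""
    def __init__(self):
        self.children = defaultdict(NGramTrie)

    def insert(self, ngram):
        node = self
        for word in ngram:
            node = node.children[word]

    def successors(self, prefix):
        node = self
        for word in prefix:
            if word not in node.children:
                return []
            node = node.children[word]
        return list(node.children)

trie = NGramTrie()
trie.insert(("the", "white", "cat", "sleeps"))
trie.insert(("the", "white", "cat", "purrs"))
print(trie.successors(("the", "white", "cat")))   # ['sleeps', 'purrs']
```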
Second idea: integrate constraints on-the-fly
• Using the first MDD (successions)
• We compile the second one (the sentence MDD)
• Constraints are checked on-the-fly
Example: The girl and the boy walked through the forest under the majestic oak trees
MDD Unfolding (top-down)
The modeling properties of MDDs lead us to solve the problem by representing each rule by an MDD and intersecting them.
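A very rough sketch of this on-the-fly unfolding, under the assumption that each rule exposes a per-prefix check (the real operators work on MDD states and keep the result compressed, not as an explicit list):

```python
def unfold(trie, checks, n_words):
    """Compile candidate sentences level by level (one level per word).

    `trie` provides the n-gram successions; `checks` is a list of functions
    prefix -> bool that prune a partial sentence as soon as a rule (length,
    lexicon, ...) can no longer be satisfied.
    """
    frontier = [()]                                   # level 0: empty prefix
    for _ in range(n_words):
        next_frontier = []
        for prefix in frontier:
            # Successors of the last two words, i.e. 3-gram successions.
            for word in trie.successors(prefix[-2:]):
                candidate = prefix + (word,)
                if all(check(candidate) for check in checks):
                    next_frontier.append(candidate)
        frontier = next_frontier
    return frontier                # prefixes of exactly n_words words
```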
Modeling: from rules to MDDs
• 9 to 15 words → MDD Universel: an arc is a word
• Language restriction (3000 lemmas) → MDD Lexique: an arc is a lemma
• 59 characters → MDD Size: an arc is the number of characters of a word, a state is the running sum
• Corpus → MDD Corpus: an arc is a word, a state is a k-gram
Intersection: from MDDs to sentences
The four MDDs (MDD Universel, MDD Lexique, MDD Taille/Size, MDD Corpus), corresponding to the 9-to-15-word, lemma-restriction, 59-character and corpus rules, are intersected.
Example of partial paths during the intersection: "# Le", "Le sac".
The intersection of the MDDs gives: Le sac noir ("The black bag").
Third idea: use an LLM to select the best sentences
LLM sentence scoring: perplexity
• Transformers (very large context window)
• Perplexity is derived from Shannon entropy
• It quantifies the uncertainty of a model with respect to a sample
• The lower the better; the range is [1, +inf)
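Concretely, perplexity is exp of the average negative log-likelihood of the tokens, PPL = exp(-(1/N) Σ_i log p(t_i | t_<i)). A minimal GPT-2 scoring sketch with the Hugging Face transformers library (model choice and batching kept trivial; this is not the pylia setup used in the experiments):

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def perplexity(sentence):
    """Perplexity of one sentence: exp of the mean cross-entropy per token."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss     # mean negative log-likelihood
    return torch.exp(loss).item()

candidates = [
    "The two men looked at each other in a state of stupefaction",
    "The wolves had for the most part wholly ignorant of warfare",
]
# Rank the generated sentences, lowest perplexity (most fluent) first.
for s in sorted(candidates, key=perplexity):
    print(f"{perplexity(s):8.2f}  {s}")
```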
Experimental conditions
• Input: 443 books from the youth category (FR)
• Input: 75 books from the fiction category (EN)
• Evaluation:
  • MNREAD candidate sentence set (syntax and meaning correct)
  • Ineligible sentence set (syntax and/or meaning problems)
• Software & Hardware:
  • The model is implemented in Java 17 in an MDD solver (MDDLib) at I3S.
  • The LLM used to rank the sentences is GPT-2.
  • Machine: Ubuntu 18.04, Intel(R) Xeon(R) Gold 5222 @ 3.80GHz CPU, 256 GB RAM.
Do we have sentences? Sentences generated with 3-grams
• With 1% of the corpus, i.e. 63,000 sentences, 3-grams yield 9,899 sentences, for example:
• J'aimerais bien que le soleil commence à se rendre au salon ("I wish the sun would start coming into the living room")
• Ils sont morts et les yeux sur le nom de ce que vous croyez
• Mes yeux se posent sur le nom qui lui a dit que vous croyez
• Ses mains n'étaient pas de sa mère dans ses bras et le même
• Aucun de ses pieds nus sur les yeux de ce qui ne se passera ("None of his bare feet on the eyes of what will happen")
• L'expression a pris un coup de poing et de leur sort demain
• Y en a pas de nous préparer à tout bout de sa petite bouche
• J'en ai dit que si je vous en emparez et vous ne pouvez pas
• Entrez là et tu as de ma part de sa main dans le monde voit
• Bien que je ne veux pas que les yeux de ce que ça me plaira
• These sentences are not admissible: with 3-grams, the large majority of the generated sentences have problems of meaning and syntax.
Are MNREAD sentences generated?
• YES! With 5-grams and 443 books (FR), we generate thousands of sentences (7,028).
• YES! With 5-grams and 75 books (EN), we generate hundreds of sentences (204).
Performance analysis
MNREAD sentence generation:
• FR: 443 books (3 GB), 72 s, 7,028 sentences
• EN: 75 books (well under 1 GB), 3 s, 204 sentences
LLM scoring:
• Scoring takes roughly 1 hour for 7,000 sentences with GPT-2 (pylia).
• Scoring takes roughly 30 minutes for 7,000 sentences with GPT-3 (OpenAI cloud).
• Recent benchmark: 569.77 ms for 15 tokens (37.98 ms per token), roughly one sentence, with llama.cpp (comparable to GPT-3) (pylia).
Discussion
• Select sentences by using GPT-2 or a similar generative model.
• Examples (everybody can have a personal opinion about the scores!):
  • Very good: The two men looked at each other in a state of stupefaction (10)
  • Moderately good: The wolves had for the most part wholly ignorant of warfare (270)
  • Bad: The farmer sat down on the Museum steps except the nice one (930)
  • Poetic (medium): Il est tombé dans le vide avec une sorte de douceur absente (100) ("He fell into the void with a kind of absent gentleness")
  • Complex:
    • The aircraft will be as common as I can to hinder their way (380)
    • The depth was very great and it seemed to me to do as I did (97)
Discussion
• Scores are related to frequency of occurrence:
  • The wolves had for the most part wholly ignorant of warfare (272)
  • Replacing words with more frequent ones lowers the score:
  • The wolves had for the most part completely ignored the war (90)
English Ranking
• The two men looked at each other in a state of stupefaction: 10.47
• The wolves had for the most part wholly ignorant of warfare: 272
New Constraints
• Generating sentences of exactly ten words (or 11, 12, ...): no problem
• Changing the level of vocabulary: no problem
• Modifying the size: no problem
• Other constraints: be careful with the combinatorics; if the main constraints are relaxed, the number of solutions explodes!
Conclusion
• Promising method: better suited to handling constraints than generic methods (e.g., GPT, BERT) and more flexible than the ad hoc method of Mansfield et al. [3].
• Advantages: modularity (rules are easy to add and/or remove), constraints are taken into account at generation time, potentially applicable to other languages.
• Perspectives: a perplexity constraint.
Thanks.