In the educational process, the disciplines D1, D2, ..., Dn follow a succession determined by their content and by the final objective, the formation of the student.
In this work, the disciplines are presented as structured text entities. The graph associated with the disciplines is established, and a method for evaluating the dependencies is proposed.
The method is tested on 9 representative sets of input data.
Structured Text Entities
Text entities are used for storing and organizing texts that represent highly diverse
information. The name "entity" indicates the generality of the concept, since a
text represents information that can be structured according to its origin
and the scope associated with it.
The following basic concepts regarding text entities are defined:
· the alphabet A is a finite set of N symbols: a1, a2, …, aN;
· the separator is a symbol which does not belong to the alphabet A and whose role is to delimit two consecutive words in a sequence of words;
· the word is a succession of symbols that follow one another. A word cj is characterized by its length, lg(cj), expressed as the number of characters that form the word;
· the vocabulary VA is a set of distinct words. The length of the vocabulary VA, denoted Lgv(VA), is the number of words that form the vocabulary;
· the text vocabulary VT is the set of distinct words that appear in the text. The text vocabulary VT is included in the vocabulary VA. Sometimes VT is identical to VA;
· the frequency of occurrence of the word cj, denoted fj, is the number of occurrences of the word cj in the text T. The frequency of occurrence of the symbol ai in the text T is the number of occurrences of this symbol and is denoted gi;
· the sub-vocabulary is a part of the vocabulary, constructed in such a way that the intersection of any pair of sub-vocabularies is the empty set. Sub-vocabularies are disjoint sets;
· the text T is a succession of words from the vocabulary VA separated by special symbols called separators. The text length Lgt(T) is the number of words that form the text. The text length Lgts(T) is the number of symbols that form the text T.
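The definitions above can be illustrated with a minimal Python sketch. The separator set, the function names and the choice to exclude separators from Lgts(T) are assumptions made only for this illustration; the source does not fix them.

```python
from collections import Counter

# Assumed separator set; by definition these symbols are not part of the alphabet A.
SEPARATORS = " \t\n.,;:!?()[]\"'"


def split_words(text, separators=SEPARATORS):
    """Split a text into words, treating every separator symbol as a delimiter."""
    words, current = [], []
    for symbol in text:
        if symbol in separators:
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(symbol)
    if current:
        words.append("".join(current))
    return words


def text_measures(text, separators=SEPARATORS):
    """Return the text vocabulary VT, the frequencies f and g, and both text lengths."""
    words = split_words(text, separators)
    word_freq = Counter(words)                           # f_j for every word c_j
    symbol_freq = Counter(s for w in words for s in w)   # g_i for every symbol a_i
    return {
        "VT": set(word_freq),                 # distinct words of the text
        "f": word_freq,
        "g": symbol_freq,
        "Lgt": len(words),                    # text length in words
        "Lgts": sum(len(w) for w in words),   # text length in symbols (separators excluded, an assumption)
    }


measures = text_measures("the graph and the text")
print(measures["Lgt"], len(measures["VT"]), measures["f"]["the"])
```

Counting symbols only inside words keeps Lgts(T) consistent with the alphabet A; if separators should also count, `len(text)` could be used instead.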
Text-based entities are constructions formed of word sequences. They are characterized
by the positions of the words within the text, by the grouping of words in order to
define a context, and by the correspondence between words and real-world elements
and actions, with qualitative attributes that group concrete aspects of reality
into homogeneous collectivities according to predefined criteria.
In [IVAN05], the elements that must be taken into consideration when constructing a text entity are presented:
· clear delimitation of the field addressed;
· definition of the key words for the field; for a text entity developed in a field, the words that describe it most concisely must be identified;
· use of a vocabulary that includes the key-word vocabulary;
· knowledge of the concepts, techniques, methods, methodologies and technologies specific to the field;
· documentation regarding the detailed elements and those connected to other fields of activity;
· observance of the syntax rules of each language;
· observance of the rules regarding text entity structure, gradual treatment of the problem, and use of standard formats for representing text-type information.
Some representative examples of text entities are: scientific, literary and
cultural works, web pages found on the Internet, the source code of
software products, dictionaries, phone books and any other grouping that
exists as a text or can be structured as a text.
A particular case of text entity is the list of disciplines
within the learning system. A discipline is a text entity formed of the set
of concepts that belong to it, along with the corresponding definitions, demonstrations and
examples.
Each text entity ET is formed of NT components denoted
SET1, SET2, …, SETNT, and each component has its
own vocabulary, which is a sub-vocabulary of the parent entity. The structures of
text entities are established from the relations between entities, and these relations
are formed from the connections between the component
vocabularies, that is, the entities' sub-vocabularies. Therefore, to describe
the structures of the entities and the connections formed between them,
the analysis must start at the base level, that of words and word vocabularies.
Considering the vocabularies V1, V2, …, VNV,
with $V_i \cap V_j = \emptyset$ for all $i \neq j$, $i, j = 1, 2, \dots, NV$,
an independency relation is established between these vocabularies, meaning that their concepts
have nothing in common. This type of relation is represented in figure 1.
Fig. 1. Independency relation between vocabularies
Linear dependency relations between
vocabularies, as opposed to the independency relation presented previously, indicate
the presence of connections between vocabularies. These connections fall
into two categories:
- full dependency, when the vocabularies are fully included one in the other: $V_1 \subset V_2 \subset \dots \subset V_{NV}$, in which case the concepts contained in the vocabularies with lower index are assumed and further extended in the vocabularies with higher index. This type of relation between vocabularies is presented in figure 2;
Fig. 2. Full dependency relation between vocabularies
- partial dependency, which arises when the vocabularies are not totally included one in the other, but some shared concepts exist, so that, for two vocabularies $V_i$ and $V_j$, $V_i \cap V_j \neq \emptyset$, $V_i \not\subset V_j$ and $V_j \not\subset V_i$. The partial dependency relation between vocabularies is presented in figure 3.
Fig. 3. Partial dependency relation between vocabularies
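The three relations can be told apart with elementary set operations. The following is a minimal sketch, assuming that each vocabulary is available as a Python set of words; the function name and the returned labels are chosen only for this illustration.

```python
def classify_relation(va, vb):
    """Classify the relation between two vocabularies given as collections of words."""
    va, vb = set(va), set(vb)
    if not (va & vb):
        return "independent"          # no common concepts
    if va <= vb or vb <= va:
        return "full dependency"      # one vocabulary is included in the other
    return "partial dependency"       # shared concepts, but no inclusion


print(classify_relation({"set", "graph"}, {"arc", "node"}))
print(classify_relation({"set", "graph"}, {"set", "graph", "arc"}))
print(classify_relation({"set", "graph"}, {"graph", "arc"}))
```

For a family V1, V2, …, VNV the same test can be applied to every pair of vocabularies.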
Other types of relations are formed by combining those already defined.
For example, a vocabulary may hold in its composition two independent
vocabularies. This is the case in which $V_2 \subset V_1$, $V_3 \subset V_1$ and $V_2 \cap V_3 = \emptyset$. The graphic representation is presented
in figure 4.
Fig. 4. Combination of dependency and independency relations between vocabularies
When the common section of two partially dependent vocabularies fully includes
two other vocabularies which are themselves partially dependent,
that is, $V_1 \cap V_2 \neq \emptyset$, $V_3 \cap V_4 \neq \emptyset$ and $V_3 \cup V_4 \subset V_1 \cap V_2$, the resulting structure is the one
displayed in figure 5.
Fig. 5. Combination of partial and full dependency between entities which are partially dependent and whose common section contains another partial dependency relation between entities
As a practical application of the presented concepts, three text
entities are considered: the first is the current work, and the other two
are taken from its bibliography, [IANA06] and [IVAN05]. The
resulting structure is displayed in figure 6.
Fig. 6. Structure showing the relations between the current work and two works included in its bibliography
According to the structure, the current work [DMA06] completely
includes the work [IANA06], meaning that it assumes all of its concepts, which are further
treated and extended. It also includes a part of the work [IVAN05], and this part
is divided into two categories:
- the first category is the part included in the current work which exists in both [IANA06] and [IVAN05];
- the second category is the part included in the current work [DMA06] only from [IVAN05], consisting of concepts not presented in [IANA06].
The Graph Associated to The Entities
A dependency graph is a graph whose nodes represent entities of different types,
with arcs marking the dependency relations between them.
Precedence is a dependency transposed onto the time line. Thus, some
operations must take place before others, and the concepts within a learning
domain depend on one another in such a way that some must precede others in the approach.
The graph node represents the text entity, or a component of it, together with
the corresponding vocabulary. Precedence is established either by directed
arcs, when the resulting graph is directed, or by simple arcs with priority
decreasing from left to right and from top to bottom, when the graph is undirected.
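A directed dependency graph and a precedence order can be sketched as follows. This is only an illustration under stated assumptions: an arc is drawn solely for full dependency (proper inclusion of one vocabulary in another), the vocabularies are hypothetical, and the ordering relies on `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def dependency_graph(vocabularies):
    """Map each entity name to the set of entities whose vocabulary it fully includes."""
    graph = {name: set() for name in vocabularies}
    for a, va in vocabularies.items():
        for b, vb in vocabularies.items():
            if a != b and vb < va:   # vb is a proper subset of va, so a depends on b
                graph[a].add(b)
    return graph


def precedence_order(graph):
    """Return one ordering in which every entity comes after the entities it depends on."""
    return list(TopologicalSorter(graph).static_order())


# Hypothetical vocabularies: V1 assumes and extends the concepts of V2 and V3.
vocabularies = {
    "V1": {"set", "graph", "arc", "node", "metric"},
    "V2": {"set", "graph"},
    "V3": {"arc", "node"},
}
graph = dependency_graph(vocabularies)
print(graph["V1"])
print(precedence_order(graph))
```

Partial dependency does not by itself give a direction to the arc, so a rule for orienting such arcs would have to be supplied separately.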
For the structures presented in chapter 1, the associated dependency graphs
are as follows:
- for the structure presented in figure 1, the associated graph is the one in figure 7;
Fig. 7. Graph associated to the structure presented in figure 1
- for the structure in figure 2, the graph in figure 8 results;
Fig. 8. Graph associated to the structure presented in figure 2
- figure 9 presents the graph associated with the structure in figure 3;
Fig. 9. Graph associated to the structure presented in figure 3
- the graph for the structure in figure 4 is the one drawn in figure 10. V1 is dependent on both V2 and V3, which means that all the concepts from V2 and V3 are assumed and extended in V1;
Fig. 10. Graph associated to the structure presented in figure 4
- the dependency graphs for the structure in figure 5, with and without the elimination of the intersections between arcs, are presented in figures 11 and 12 respectively. V3 and V4 are partially dependent and form the base for the entities V1 and V2, being completely included in their common section;
- for the structure presented in figure 6, the constructed dependency graph is the one presented in figure 13.
Fig. 13. Graph associated to the structure presented in figure 6
This dependency graph is the same as the one in figure 8, which
shows that although the dependency is represented, the quantity of shared
elements has no specific representation in the drawing.
Associated Metrics
In [IVAN05] a metric is defined as a mathematical model of the following
form:
$$y = f(x_1, x_2, \dots, x_{nft})$$
where:
- y is the result variable, which depends on the values x1, x2, …, xnft of the factors Ft1, Ft2, …, Ftnft;
- xi is the numerical value of the influential factor Fti;
- Fti is the i-th influential factor from the set of factors that determine the result variable y.
The mathematical model provides a quantification of the characteristics
of the analyzed entity.
The work [BOJIO04] mentions that metrics have the following functions:
· measuring: values are obtained for the elements of the text entity structure;
· comparing: the resemblances and differences between two or more analyzed entities are pointed out, for classification or hierarchical ordering;
· analysis: distinguishes the quality characteristics of the analyzed entities;
· synthesis: extracts what is essential for a collectivity of analyzed text entities;
· estimation: the future evolution of the behaviour of the analyzed text entity is established;
· verification: implies the validation of the mathematical models associated with the metrics.
For measuring purposes, the following quality characteristics of text entities
are taken from [IANA06].
Documentation is a very important
quality characteristic. A text entity is defined using an expert vocabulary
VT. To justify the documentation, the following must hold:
· the titles of the articles in the bibliography must be based on a vocabulary VB included in the vocabulary VT;
· the words of the articles written by the entity's authors must form a vocabulary VA included in the vocabulary VT.
The quality of progression
refers to the gradual nature of the approach.
The concepts are treated step by step, so that:
· primary concepts, clarified by examples, are considered first;
· based on the primary concepts, new concepts are defined, some of them derived from others;
· the connections between concepts are ensured by formulas, examples and diagrams;
· synthesis and analysis determine the particularization and aggregation of all the presented elements;
· the particularization is obtained by passing from one definition of concepts to another.
Consistency implies the existence
of definitions, relations and presentations arranged so that a logical succession
can be obtained. The level of particularization grows as the text grows.
Uniformity consists in using
the bibliographic sources with the same intensity in the development of the concepts.
If a text entity ET aims to treat a specific field as
a synthesis, this implies the existence of a bibliography formed of the titles G1, G2,
..., Gh, where h is the length of the bibliography, expressed
as the number of works used.
Uniformity means that each work is referenced in
equal measure. The analysis of this quality characteristic implies
parsing the entity ET in order to compute the frequencies fi with
which each work Gi from the bibliography is referenced, i = 1, 2, ..., h.
For the quality analysis of the entities, the following must be obtained:
· the length of the entity ET as the total number of words;
· the length of the entity ET as the total number of essential words, obtained by eliminating the connecting words and the words whose form does not change regardless of context;
· the length of the entity vocabulary as the maximum number of orthogonal words, obtained by grouping together the words derived from the same root word;
· the frequencies of the essential words which form a vocabulary;
· the frequencies fi of the references to the titles Gi, with i = 1, 2, ..., h;
· the indicators used in determining the quality of the references, such as:
o the proportion of bibliography titles introduced without being quoted, Ipn, calculated using the relation $I_{pn} = r / h$, where r is the number of titles $G_i$ with $f_i = 0$;
o the proportion of quoted titles, Ipc, calculated using the relation Ipc = 1 - Ipn;
o the proportion of excessively quoted titles, Ipex, calculated using the relation $I_{pex} = x / h$, where x is the number of excessively quoted titles;
o the list of excessively quoted titles, Gj1, Gj2, ..., Gjx, whose frequencies exceed the level taken as excessive;
o the list of unquoted titles, Gj1, Gj2, ..., Gjr, for which fjs = 0, js ∈ {j1, j2, ..., jr}.
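The reference indicators can be computed by parsing the entity for bracketed citation keys such as [IVAN05]. The sketch below makes several assumptions that are not stated in the source: citations appear as upper-case keys followed by digits inside square brackets, Ipn is taken as the share of titles with zero frequency, and the level above which a title counts as excessively quoted is a hypothetical parameter, `excess_threshold`, supplied by the caller.

```python
import re
from collections import Counter


def reference_indicators(text, bibliography, excess_threshold=None):
    """Compute citation frequencies f_i and the shares of unquoted and quoted titles."""
    if not bibliography:
        raise ValueError("the bibliography must contain at least one title")
    # Assumed citation format: [KEY] with upper-case letters followed by digits.
    cited = Counter(re.findall(r"\[([A-Z]+\d+)\]", text))
    freq = {key: cited.get(key, 0) for key in bibliography}
    h = len(bibliography)
    unquoted = [key for key, f in freq.items() if f == 0]
    ipn = len(unquoted) / h              # assumed form: share of titles never quoted
    result = {"f": freq, "Ipn": ipn, "Ipc": 1 - ipn, "unquoted": unquoted}
    if excess_threshold is not None:     # hypothetical criterion for excessive quoting
        excessive = [key for key, f in freq.items() if f > excess_threshold]
        result["excessive"] = excessive
        result["Ipex"] = len(excessive) / h
    return result


sample = "Metrics are defined in [IVAN05] and applied in [IANA06]; see also [IVAN05]."
report = reference_indicators(sample, ["IVAN05", "IANA06", "BOJIO04", "DMA06"], excess_threshold=1)
print(report["f"], report["Ipn"], report["Ipc"], report["unquoted"])
```

Uniformity can then be judged by how close the frequencies in `report["f"]` are to one another.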
Proportionality is a very important
characteristic through its meaning and especially through the effects it produces
each time the entity is referenced.
Just as all of reality is formed of objects, processes, phenomena and
beings, characterized by structures formed of interacting components,
text entities, as reflections of reality, also consist of interacting components,
even though they are artificial constructions.
Proportionality is reflected in the attention given to the analysis of the real
world. A complex subsystem must correspond to a subtext longer than the
one corresponding to a simple subsystem. Proportionality is thus expressed by
the relations among the parts within the text entity itself.
Intraorthogonality is the quality
characteristic that marks the differences among the components of a text entity.
The text entity ET having the tree structure in figure
14 is considered.
Fig. 14. Tree structure corresponding to a text entity
The subtexts SCij are orthogonal to one another if
and only if $H(SC_{ij}; SC_{kl}) \to 1$.
Intraorthogonality reveals how different the parts that form a text
entity are.
When some concepts are treated again, the intraorthogonality
declines.
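The indicator H is not defined in this section. Purely as an illustration, the sketch below substitutes one common choice, 1 minus the Jaccard similarity of the two subtext vocabularies, which tends to 1 when the subtexts share almost no words. The measure, the function name and the sample words are assumptions, not the definition used in the source.

```python
def orthogonality(words_a, words_b):
    """Return 1 minus the Jaccard similarity of two word collections (1.0 means nothing shared)."""
    va, vb = set(words_a), set(words_b)
    if not va and not vb:
        return 0.0   # convention: two empty subtexts are treated as identical
    return 1 - len(va & vb) / len(va | vb)


sc11 = ["alphabet", "separator", "word"]
sc12 = ["graph", "arc", "node"]
sc21 = ["word", "vocabulary"]
print(orthogonality(sc11, sc12))   # no shared words
print(orthogonality(sc11, sc21))   # one shared word lowers the value
```

Under this stand-in measure, treating the same concepts again in another subtext enlarges the shared vocabulary and lowers the value, in line with the decline described above.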
When the graph of concepts and of connections between concepts is planned at the level
of each subchapter, the result consists of sub-graphs with nodes arranged on levels.
The nodes are connected only by simple arcs that link nodes
on adjacent levels, as shown in figure 15.
Fig. 15. Graph structure with arcs connecting adjacent nodes
Interorthogonality refers to the
differences between entities and reveals the measure in which they differ in presentation
form or content.
Two text entities tend to be orthogonal when they
do not treat similar concepts.
For example, works elaborated in a specific field are orthogonal
if their texts have in common only the words that are characteristic of the field,
the other words being very different in frequency of occurrence and position [IVAN05].