<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
<head>
<meta charset="utf-8" />
<meta name="generator" content="pandoc" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
<title>ChatScript-Pattern-Redux</title>
<style>
html {
color: #1a1a1a;
background-color: #fdfdfd;
}
body {
margin: 0 auto;
max-width: 36em;
padding-left: 50px;
padding-right: 50px;
padding-top: 50px;
padding-bottom: 50px;
hyphens: auto;
overflow-wrap: break-word;
text-rendering: optimizeLegibility;
font-kerning: normal;
}
@media (max-width: 600px) {
body {
font-size: 0.9em;
padding: 12px;
}
h1 {
font-size: 1.8em;
}
}
@media print {
html {
background-color: white;
}
body {
background-color: transparent;
color: black;
font-size: 12pt;
}
p, h2, h3 {
orphans: 3;
widows: 3;
}
h2, h3, h4 {
page-break-after: avoid;
}
}
p {
margin: 1em 0;
}
a {
color: #1a1a1a;
}
a:visited {
color: #1a1a1a;
}
img {
max-width: 100%;
}
svg {
height: auto;
max-width: 100%;
}
h1, h2, h3, h4, h5, h6 {
margin-top: 1.4em;
}
h5, h6 {
font-size: 1em;
font-style: italic;
}
h6 {
font-weight: normal;
}
ol, ul {
padding-left: 1.7em;
margin-top: 1em;
}
li > ol, li > ul {
margin-top: 0;
}
blockquote {
margin: 1em 0 1em 1.7em;
padding-left: 1em;
border-left: 2px solid #e6e6e6;
color: #606060;
}
code {
font-family: Menlo, Monaco, Consolas, 'Lucida Console', monospace;
font-size: 85%;
margin: 0;
hyphens: manual;
}
pre {
margin: 1em 0;
overflow: auto;
}
pre code {
padding: 0;
overflow: visible;
overflow-wrap: normal;
}
.sourceCode {
background-color: transparent;
overflow: visible;
}
hr {
background-color: #1a1a1a;
border: none;
height: 1px;
margin: 1em 0;
}
table {
margin: 1em 0;
border-collapse: collapse;
width: 100%;
overflow-x: auto;
display: block;
font-variant-numeric: lining-nums tabular-nums;
}
table caption {
margin-bottom: 0.75em;
}
tbody {
margin-top: 0.5em;
border-top: 1px solid #1a1a1a;
border-bottom: 1px solid #1a1a1a;
}
th {
border-top: 1px solid #1a1a1a;
padding: 0.25em 0.5em 0.25em 0.5em;
}
td {
padding: 0.125em 0.5em 0.25em 0.5em;
}
header {
margin-bottom: 4em;
text-align: center;
}
#TOC li {
list-style: none;
}
#TOC ul {
padding-left: 1.3em;
}
#TOC > ul {
padding-left: 0;
}
#TOC a:not(:hover) {
text-decoration: none;
}
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
div.columns{display: flex; gap: min(4vw, 1.5em);}
div.column{flex: auto; overflow-x: auto;}
div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
/* The extra [class] is a hack that increases specificity enough to
override a similar rule in reveal.js */
ul.task-list[class]{list-style: none;}
ul.task-list li input[type="checkbox"] {
font-size: inherit;
width: 0.8em;
margin: 0 0.8em 0.2em -1.6em;
vertical-align: middle;
}
.display.math{display: block; text-align: center; margin: 0.5rem auto;}
</style>
<!--[if lt IE 9]>
<script src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.3/html5shiv-printshiv.min.js"></script>
<![endif]-->
</head>
<body>
<h1 id="chatscript-pattern-redux">ChatScript Pattern Redux</h1>
<p>Copyright Bruce Wilcox, mailto:[email protected]
www.brilligunderstanding.com<br />
<br>Revision 4/24/2022 cs12.1</p>
<p>Pattern matching information was introduced in the Beginner manual
and expanded in the <a
href="ChatScript-Advanced-User-Manual.html">Advanced User Manual</a>.
Since pattern matching is of such importance, this concise manual lists
everything about patterns in one place, including patterns not listed in
the Advanced manual.</p>
<p>NOTE: despite the extraordinary range of weird matching abilities,
almost all of my normal code is based on one of three patterns:</p>
<pre><code># rule 1
u: (![plastic] << bag trick >>)
# rule 2
u: (I * love * you)
# rule 3
t: What fruit do you like?
a: (~why)
a: (orange)
a: (apple)
a: (~vegetables)</code></pre>
<p>Rule 1 - searches for key words in any order. While there is a normal
order to questions, e.g., <em>where do you live</em>, one can ask
<em>you live where?</em> so handling arbitrary order is generally
valuable. Just have all the keywords you need to detect a meaning and
use <code>![...]</code> to get rid of interpretations you don’t
want.</p>
<p>Rule 2 - requires an order when both first person and second person
pronouns are involved, since order will matter.</p>
<p>Rule 3 - uses simple keywords or concept sets in rejoinders, since
the context of the gambit constrains the input so highly.</p>
<h2 id="if-patterns">IF Patterns</h2>
<p>Pattern matching can be done not just in a rule’s pattern component
but also in its output component, within an <code>if</code> statement,
e.g.:</p>
<pre><code>if ( PATTERN _~number ) { print( _0) }</code></pre>
<p>That is, if the first word in the test condition is the word
<code>PATTERN</code>, the rest is treated as a standard pattern of a
rule (not using <code>AND</code> <code>OR</code> etc). You can capture
data here or do anything a normal pattern does.</p>
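<p>As a hedged sketch of how this might sit inside a full rule (the
concept names and the reply wording are illustrative assumptions, not
taken from a real bot), the rule below fires on any fruit and then uses
an <code>if</code> pattern to look for a number in the same
sentence:</p>
<pre><code>u: ( ~fruits )
    if ( PATTERN _~number ) { So you want _0 of them. }
    else { How many would you like? }</code></pre>
<p>For an input like <em>give me 3 apples</em> the <code>if</code>
pattern should capture the number into <code>_0</code>; for <em>give me
apples</em> the <code>else</code> branch runs.</p>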
<h2 id="pattern-position">Pattern Position</h2>
<p>A pattern consists of tokens. By default, any normal word in
canonical form can match any form of the word, so <em>he</em> in a
pattern can match <em>him</em>, <em>he</em>, <em>his</em>. A pattern
aborts when a token fails to match unless allowed to not match.</p>
<p>The performance cost of a pattern is generally linear in the number
of tokens processed. That means these two rules take the same time to
match (other than the imperceptible time difference to read the longer
token).</p>
<pre><code>u: (apple)
u: (~ten_thousand_names_of_fruits)</code></pre>
<p>The system tracks current position in the sentence as it matches.</p>
<p>The first token of a pattern is allowed to match anywhere in the
sentence. After that normally tokens are matched against words in
consecutive order in the input. If a pattern starts to match and then
fails, the system is allowed to retry matching later in the sentence
once. It does this by freeing up the first matching word/concept token
and letting it rebind later.</p>
<p>Given this rule:</p>
<pre><code>u: ( I like apple )</code></pre>
<p>The input <em>Do you know that I like oranges and I like apples</em>
would match as follows.</p>
<p>The first pattern token <code>I</code> would match the first
<em>I</em> partway into the sentence (because it is allowed to match
anywhere).</p>
<p>The next pattern token <code>like</code> is required to match the
next input word in the sentence, which it does.</p>
<p>The third pattern word <em>apple</em> fails to match
<em>oranges</em>. We just failed. But we have one retry left.</p>
<p>So <em>I</em> is sought deeper in the sentence and matched.
<code>like</code> matches <em>like</em> and <code>apple</code> matches
<em>apples</em>. So we match.</p>
<p>Had that not matched, no more retries exist so the failure sticks.
There are tokens you can use that alter the rules/location around
current position.</p>
<h2 id="pattern-constituents">Pattern Constituents</h2>
<h3 id="type-of-sentence-s">Type of Sentence <code>s:</code>
<code>?:</code></h3>
<p>A responder beginning with <code>s:</code> or <code>?:</code>
implicitly tests that the sentence is a statement or a question. This
test is built in and is applied even before the pattern. All other rules
are not immediately sensitive to the kind of sentence.</p>
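<p>A minimal sketch of the difference, assuming a <code>~fruits</code>
concept exists (the replies are illustrative):</p>
<pre><code># only considered when the input is a statement
s: ( I like ~fruits ) Fruit is good for you.
# only considered when the input is a question
?: ( you like ~fruits ) Yes, I like fruit.
# considered for either kind of sentence
u: ( ~fruits ) Let us talk about fruit.</code></pre>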
<h3 id="existence---word-concept-var-sysvar-_0-0-var">Existence - word
<code>~concept</code> <code>$var</code> <code>%sysvar</code>
<code>_0</code> <code>@0</code> <code>^var</code> <code>?</code>
<code>~</code></h3>
<p>Basic pattern matching is against words or concepts. Does this word
or concept exist?</p>
<pre><code>u: ( this ~animal )</code></pre>
<p>matches <em>this dog</em> or <em>this dogs</em> but not <em>this is
my dog</em></p>
<p>You can also ask if a user variable is defined just by naming it:</p>
<pre><code>u: ( $myvar help )</code></pre>
<p>this only matches if input has <em>help</em> and <code>$myvar</code>
is not null.</p>
<h4 id="system-variables">System variables</h4>
<p>One would not ask whether system variables are defined (they almost
always are) but would use them in a relation instead.</p>
<p>Similarly, <code>_0</code> by itself in a pattern means is it
defined, that is, not null.</p>
<pre><code>u: ( _{apple orange} _0 )</code></pre>
<p>matches only if apple or orange got matched. And <code>@0</code> by
itself means does this fact-set have any facts stored in it.</p>
<p>You can also reference an argument to a function call, and its value
will be used to decide what to do.</p>
<h4 id="stand-alone">Stand-alone <code>?</code></h4>
<p>A stand-alone <code>?</code> means is this sentence a question.</p>
<h4 id="stand-alone-1">Stand-alone <code>!?</code></h4>
<p><code>!?</code> would test if it is not a question.</p>
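<p>For example, a pair of rejoinders might branch on whether the user
answered with a question. This is only a sketch; the gambit and the
replies are made up for illustration:</p>
<pre><code>t: I think robots will take over the world.
    a: ( ? ) You sound doubtful.
    a: ( !? ) Tell me more about what you think.</code></pre>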
<h4 id="stand-alone-2">Stand-alone <code>~</code></h4>
<p>A stand-alone <code>~</code> means the current topic is already on
the pending topic list (was recently considered an active topic).</p>
<h3 id="grouping-pairs">Grouping Pairs <code>(</code> <code>)</code>
<code>[</code> <code>]</code> <code>{</code> <code>}</code>
<code><<</code> <code>>></code></h3>
<h4 id="parens">Parens <code>(</code> … <code>)</code></h4>
<p>Parens mean the tokens within must be found “in sequence”. The
notation of a pattern starts with parens, but has the unusual property
of allowing the match to occur anywhere within the sentence, not just at
the start. Any nested parens do not have that property, and still
require in sequence.</p>
<pre><code>u: ( this (is my) pattern)</code></pre>
<p>matches <em>this is my pattern</em> and not <em>this sometimes is my
pattern</em></p>
<h4 id="brackets">Brackets <code>[</code> … <code>]</code></h4>
<p>Brackets mean match one of the contained tokens, in the order given. A
bracket list tries all its members in sequence, stopping when it finds a
match. For the input <em>I go home for Christmas</em> this will not
match:</p>
<pre><code>u: ( [~noun ~verb] * home )</code></pre>
<p>because <code>~noun</code> will match to <em>home</em> and then
<em>home</em> cannot be found later. On a retry, <code>~noun</code> will
match to <em>Christmas</em>. Since <code>~noun</code> can match multiple
times, <code>~verb</code> never gets tried.</p>
<p>You can composite things like:</p>
<pre><code>u: ( [ apple pear (favorite fruit) cherry ] )</code></pre>
<p>to match <em>I eat pear</em> and <em>my favorite fruit</em>, but this
form is unlikely to be used in normal CS.</p>
<p>Note that <code>[ … ]</code> and <code>~concept</code> are similar
but different in important ways. Matching <code>~concept</code> is
faster than the corresponding list inside <code>[]</code> because naming
the concept only requires one token. But it takes more memory to store
the concept than it does to put the words inside the
<code>[]</code>.</p>
<p>The other fundamental difference is in position. Words in
<code>[]</code> are matched in the order given, possibly moving your
position mark deep into the sentence.</p>
<p>Words in a concept are all matched simultaneously,
so which one is found first in the sentence is what sets the position.
For an input <em>I like beer but not wine</em></p>
<pre><code>u: ( I like * ~drinks )</code></pre>
<p>this would match beer if beer and wine are in that concept in any
order.</p>
<pre><code>u: ( I like * [wine beer] )</code></pre>
<p>this would match wine even though it is farther in the sentence.</p>
<h4 id="braces">Braces <code>{</code> … <code>}</code></h4>
<p>Braces mean match one of the contained tokens if you can, but don’t
fail if you don’t.</p>
<p>Using <code>{}</code> inside of angles is pointless (unless you put
an underscore in front to memorize something) because it makes no
difference to matching whether or not you had the <code>{}</code>
content.</p>
<p>Braces do not align position within a sentence. They are normally
used to assist in positional alignment by swallowing words.</p>
<pre><code>u: ( I go to {the} market )</code></pre>
<p>matches both <em>I go to the market</em> and <em>I go to
market</em>.</p>
<p>If you use underscore before braces to memorize the answer found,
then when no answer is found the match variable is set to
<code>null</code> (no content) but it is set.</p>
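<p>A short sketch of that behavior (the rule is illustrative):</p>
<pre><code># "I go to the market"  : _0 holds "the"
# "I go to market"      : _0 is still set, but to null
u: ( I go to _{the a} market )</code></pre>
<p>Because <code>_0</code> is assigned either way, any later
memorization in the same pattern should still land in
<code>_1</code>.</p>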
<h4 id="angles">Angles <code><<</code> …
<code>>></code></h4>
<p>Angles mean match all of the contained tokens in any order.</p>
<p>Putting <code>*</code> in this kind of pattern is illegal because it
has no meaning.</p>
<p>Position is not relevant anyway. Position is freely reset to the
start following this sequence so if you had the pattern:</p>
<pre><code>u: ( I * like << really >> photos )</code></pre>
<p>and input <em>photos I really like</em> then it would match because
it found <em>I * like</em> then found anywhere <code>really</code> and
then reset the position freely back to start and found
<code>photos</code> anywhere in the sentence.</p>
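<p>As a simpler sketch of order-free matching (the rule is
illustrative), this pattern only asks that all three keywords appear
somewhere in the sentence:</p>
<pre><code>u: ( << dog chase cat >> )</code></pre>
<p>It should match both <em>the dog chases the cat</em> and <em>was the
cat chased by a dog</em>, since the words can be found in any order and
in any form.</p>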
<h3 id="wildcards-2-3--2-2b">Wildcards <code>*</code> <code>*~2</code>
<code>*3</code> <code>*-2</code> <code>*~2b</code></h3>
<p>Wildcards allow you to relax the positional requirements for
matching. The classic wildcard <code>*</code> allows you to have zero or
more words between other tokens in a pattern.</p>
<pre><code>u: ( I * you )</code></pre>
<p>matches <em>I love chicken and hate you</em> as well as <em>I you
they</em>.</p>
<p>You can limit the range by adding <code>~n</code> after it.
So <code>*~1</code> means 0 or 1 words may intervene.</p>
<p><code>*~2</code> is what I commonly use to restrict a range. This
allows a determiner and an adjective to fit before a noun, for example,
but not allow a pattern to match weirdly.</p>
<pre><code>u: ( I like *~2 cat )</code></pre>
<p>matches <em>I like my cats</em> or <em>I like a yellow cat</em>.</p>
<p><code>*~2b</code> is similar to <code>*~2</code> except it tries to
match bidirectionally. First it tries to match behind it, and if that
fails, it tries forward (like <code>*~2</code>). You may not follow a
bidirectional wildcard with either <code>{</code> or <code>(</code>.</p>
<p>You can also request a match of a specific number of words in
succession using <code>*n</code>. <code>*1</code> means get the next
word. If you are already positionally on the end of the sentence, this
match fails.</p>
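<p>A common use, sketched here with an illustrative reply, is capturing
the single word that follows a known phrase:</p>
<pre><code>u: ( my name is _*1 ) So '_0 is your name.</code></pre>
<p>For <em>my name is Bruce</em> the word <em>Bruce</em> is memorized
into <code>_0</code>. For <em>my name is</em> with nothing after it,
<code>*1</code> has no word to take and the match fails.</p>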
<p>If you aren’t sure how many words are left, you could do something
like this:</p>
<pre><code>u: ( apple _[*4 *3 *2 *1] )</code></pre>
<p>which will grab the next 4 or 3 or 2 or 1 words, depending on how
many are available.</p>
<p>This is generally done with an underscore in front to memorize the
sequence.</p>
<p><code>*-2</code> is like <code>*2</code>, only it matches backwards
instead of forwards. Valid thru <code>*-9</code>.</p>
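<p>One way this might be used, sketched under the assumption that the
backward wildcard counts from the current position: move to the end of
the sentence and then step back one word to capture the last word.</p>
<pre><code>u: ( I love * > _*-1 )</code></pre>
<p>For <em>I love ripe mangoes</em> this should leave <em>mangoes</em>
in <code>_0</code>.</p>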
<h3 id="negation-and-and--">Negation <code>!</code> and <code>!!</code>
and <code>!-</code></h3>
<p><code>!x</code> means match only if x is not found anywhere in the
sentence later than where we are:</p>
<pre><code>u: ( !not I love you )</code></pre>
<p>This pattern says the word not cannot occur anywhere in the
sentence.</p>
<p><code>!!x</code> means match only if x is not the next word.</p>
<p><code>!-x</code> means match only if x is not any prior word, that
is, x cannot occur anywhere before the current position in the
sentence.</p>
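<p>Two small sketches of the positional forms (the rules are
illustrative, and the comments describe the behavior expected from the
definitions above):</p>
<pre><code># fails on "I really love you", since "really" directly follows "I"
u: ( I !!really * love )
# fails on "I do not love you", since "not" occurs before "love"
u: ( love !-not )</code></pre>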
<h3 id="original-form">Original Form <code>'</code></h3>
<p>While CS normally matches both original and canonical forms of words
when you give a pattern word in canonical form, you can require it only
match the original form by quoting it.</p>
<pre><code>u: ( I * 'take it ) </code></pre>
<p>does not match <em>I am taking it</em></p>
<p>Likewise in a relation where you use a match variable, quoting it
means use only its original value.</p>
<pre><code>u: ( _~fruits '_0==apple )</code></pre>
<p>matches <em>I like apple</em> but not <em>I like apples</em></p>
<h3 id="literal-next">Literal Next <code>\</code></h3>
<p>You can tell CS that a token should be considered a token, not a
special form, by putting a <code>\</code> in front of it.</p>
<p>This applies to single characters like: <code>\[ \]</code> and it
also applies to relational tokens like <code>\tom=*</code> which means
do not treat this as a relational test, but instead as a token whose
name is wildcarded.</p>
<p>Note that the <code>\</code> does not suppress detecting the
<code>*</code> in a word and therefore allowing variant spelling.</p>
<h3 id="composite-words-my-composite-word">Composite Words “my composite
word”</h3>
<p>There are sequences of words that have a specific meaning and are
treated as a single word, e.g., <em>batting cage</em>.</p>
<p>In a dictionary these are often represented using an <code>_</code>
instead of a space, e.g., <em>batting_cage</em>.</p>
<p>When CS tokenizes your input, it automatically converts your
separated input words into ones with underscores in them when
appropriate.</p>
<p>They are then no longer separate words, but instead a single composite
word. This would normally mean that</p>
<pre><code>u: ( batting cage )</code></pre>
<p>would not match. But the script compiler does the same tokenization
thing, so your actual internal pattern looks like:</p>
<pre><code>u: ( batting_cage )</code></pre>
<p>For clarity, it is recommended that when you know you are dealing
with a composite word, you use the underscore notation.</p>
<p>Sequences of words can also be designated using double quotes.</p>
<pre><code>u: ( "batting cage" )</code></pre>
<p>CS converts a quoted string into the same underscore notation. The
distinction between the two is generally one of documentation.</p>
<p>I use quoted strings for phrases to highlight the intention that they
are a phrase. I also use them for multiple word proper names like
<em>Eiffel Tower</em>.</p>
<p>It is particularly important to use the quoted notation when
punctuation is embedded in the name like <em>John’s Apple Pie</em>
because knowing where to put underscores when punctuation is involved is
tricky.</p>
<p>By using quotes, you tell the system to manage things appropriately
(<code>John_'s_Apple_Pie</code>)</p>
<p>When using the quoted notation, the system will actually try to match
original and canonical, just like with ordinary words. If all words in
phrase are canonical, the system will match any form of each word.</p>
<p>If one is not canonical, it can only match the original form.</p>
<pre><code>u: ( "king of the jungle" )</code></pre>
<p>cannot match <em>kings of the jungle</em> because <em>the</em> in
pattern is not canonical.</p>
<pre><code>u: ( "king of a jungle" )</code></pre>
<p>but the above rule can match <em>kings of the jungle</em> since all
words in the quote are canonical.</p>
<h3 id="memorization-_">Memorization <code>_</code></h3>
<p>Placing an underscore means to memorize what was matched onto a match
variable. Match variables are allocated in sequence in a pattern,
starting with <code>_0</code> and increasing to <code>_1</code> etc for
each memorized match.</p>
<p>The system memorizes the original word, the canonical word, and the
position in the sentence of the match.</p>
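<p>A minimal sketch, assuming a <code>~fruits</code> concept that
contains <em>apple</em> and <em>pear</em> (the reply is
illustrative):</p>
<pre><code>u: ( I like _~fruits and _~fruits ) So '_0 and '_1 are your favorites.</code></pre>
<p>For <em>I like apples and pears</em>, the quoted forms
<code>'_0</code> and <code>'_1</code> give the original words
<em>apples</em> and <em>pears</em>, while unquoted <code>_0</code> and
<code>_1</code> would give the canonical forms <em>apple</em> and
<em>pear</em>.</p>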
<h3 id="relations">Relations <code>></code> <code><</code>
<code>?</code> <code>==</code> <code>!=</code> <code><=</code>
<code>>=</code></h3>
<p>You can test relationships by conjoining a token with a relationship
operator and another token, with no spaces. E.g.,</p>
<pre><code>u: ( I am _~number > _0>18 ) You are of legal age.</code></pre>
<p>The relationship operators are:</p>
<table>
<colgroup>
<col style="width: 53%" />
<col style="width: 46%" />
</colgroup>
<thead>
<tr class="header">
<th style="text-align: center;">operator</th>
<th>meaning</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td style="text-align: center;"><code>==</code></td>
<td>equal</td>
</tr>
<tr class="even">
<td style="text-align: center;"><code>!=</code></td>
<td>not equal</td>
</tr>
<tr class="odd">
<td style="text-align: center;"><code><</code></td>
<td>less than</td>
</tr>
<tr class="even">
<td style="text-align: center;"><code><=</code></td>
<td>less than or equal to</td>
</tr>
<tr class="odd">
<td style="text-align: center;"><code>></code></td>
<td>greater than</td>
</tr>
<tr class="even">
<td style="text-align: center;"><code>>=</code></td>
<td>greater than or equal to</td>
</tr>
<tr class="odd">
<td style="text-align: center;"><code>&</code></td>
<td>bit anded results in non-zero</td>
</tr>
<tr class="even">
<td style="text-align: center;"><code>?</code></td>
<td>is member of 2nd arg concept or topic or JSON array. If no argument
occurs after, means is value found in sentence</td>
</tr>
</tbody>
</table>
<p>Using a compare with two text strings (not numbers) will evaluate
based on case-independent alpha sorting.</p>
<p>For comparison against a number (< <= > >=) a null value
will be treated as the number 0.</p>
<p>The <code>?</code> operator has two forms. <code>xxx?~yyy</code> will
look for actual membership in the set whereas <code>_n?~yyy</code> will
only see if the location of match detection of <code>_n</code> is the
same as a corresponding match location for the concept. If the concept
has not been marked, then obviously no match is found.</p>
<p>Note: You can put <code>!</code> before the tokens instead of using
<code>!=</code> and <code>!?</code>. E.g.,</p>
<pre><code>u: ( _~noun !_0?~fruit ) # matches if the noun is not in the fruit concept</code></pre>
<h4 id="dynamic-matching">Dynamic matching</h4>
<p>The stand-alone <code>?</code> is used with variables for dynamic
matching.</p>
<p>While you cannot do memorization in front of a comparison (because no
positional data is gained) you can in front of the <code>?</code>
operator since finding where in the sentence something is will return a
position for memorization.</p>
<pre><code>u: ( $var? ) </code></pre>
<p>means is the value of <code>$var</code> found in the sentence
anywhere</p>
<p>Note that when <code>$var</code> is a normal word, that is simple for
CS to handle.</p>
<p>If <code>$var</code> is a phrase, then generally CS cannot match it.
This is because for phrases, CS needs to know in advance that a phrase
can be matched.</p>
<p>If you put <em>take a seat</em> as a keyword in a concept or topic or
pattern, that phrase is stored in the dictionary and marked as a pattern
phrase, meaning if the phrase is ever seen in a sentence, it should be
noticed and marked so it can be matched in a pattern.</p>
<p>But if it is merely in a variable, then the dictionary is unaware of
the phrase and so <code>$var?</code> will not work for it.</p>
<p>There is also a <code>?$var</code> form, which means see if the value
of the variable is findable. The value can be either a word or a concept
name.</p>
<h3 id="assignment-in-a-pattern">Assignment in a pattern</h3>
<p>You can directly assign to any variable from any other value using
<code>:=</code>. You can even do arithmetic in these assignments
(<code>:+=</code> <code>:-=</code> <code>:*=</code> <code>:/=</code>
<code>:&=</code> and any of the other numeric assignment operators).</p>
<pre><code> $value = 5
( _some_test $value:=5 $value1:=_0 $value2:='_0 $value3:=%time )
( _some_test $value:+=5 $value1:-=_0 )</code></pre>
<p>If you want to do function call assignment, you can do this:</p>
<pre><code> $value:=^"^function(foo d)"</code></pre>
<p>You have to use an active string here because spaces normally
break tokens apart, and a pattern token involving a function
needs to have all of its arguments in the same token. Assigning from
an active string works because the double quotes around it prevent the
token from breaking apart.</p>
<h3 id="escape">Escape <code>\</code></h3>
<p>If you want to match a reserved punctuation symbol like
<code>[</code> or <code>(</code>, you must escape it by putting a
backslash in front. This is commonly done in matching out-of-band
information.</p>
<pre><code>u: ( < \[ * \] ) ^respond(~determine_oob)</code></pre>
<p>One also uses escape if you want to know if the sentence was
punctuated with an exclamation.</p>
<pre><code>s: ( \! )</code></pre>
<p>means user did something like <em>I love you!</em>.</p>
<p>You may use either <code>?</code> or <code>\?</code> when asking if
the sentence has a question in it. You would generally only do this in a
rejoinder.</p>
<h2 id="concept-intersection-keywords">Concept intersection
keywords</h2>
<p>If you join a word (or a concept) and one or more concepts, that
represents the intersection of them. E.g., <code>~animals~tasty</code>
will reference all animals considered tasty.</p>
<p>Note, you cannot use word~1 (meaning specification) or word~n
(pos-tag specification) on your first word.</p>
<h3 id="function-call---xxx...">Function Call -
<code>^xxx(...)</code></h3>
<p>You can call a function from within a pattern. If the function
returns a failure code of any kind, the match fails. If the function is
a predefined system function, you are allowed relation operators on the
result as well.</p>
<pre><code>u: ( ^lastused(~ gambit)>5 )</code></pre>
<p>NOTE:<br />
User defined functions (<code>patternmacros</code>) do not allow
relational operators after them.</p>
<p>Patternmacros do not generate answers. They are treated as in-line
additional pattern tokens.</p>
<pre><code>Patternmacro: ^testuse(^value) _~noun _0==^value
u: ( ~noun ^testuse(apple)) # matches "I like pear and apple"</code></pre>
<p>A powerful use of function calling is to call
<code>^respond(~topicname)</code> in a pattern. The topic can match
something and set up a variable for further guidance. E.g.,</p>
<pre><code>u: ( ^respond(~finddelay) $$delay ) Wait for $$delay.</code></pre>
<p><code>~finddelay</code> can hunt for time referred to in seconds,
minutes, hours, etc, or in words like next week or tomorrow or whatever
complex matching you want to do.</p>
<h3 id="partially-spelled-words-ing-bottle-8bott">Partially Spelled
words: <code>*ing</code> <code>bottle*</code> <code>8bott*</code></h3>
<p>You can request a match against a partial spelling of an original
word (not its canonical alternative) in various ways.</p>
<p>If you use <code>*</code> somewhere after an alpha, it matches any
number of characters.</p>
<pre><code>u: ( sag*us ) </code></pre>
<p>matches many misspellings of <em>sagittarius</em>.</p>
<p>If you use <code>*</code> followed by an alpha, you get anything as a
prefix followed by what you request.</p>
<pre><code>u: ( *tha ) </code></pre>
<p>matches <em>Martha</em>.</p>
<p>If you put a number in front, it means the word must be exactly that
many characters long, matching your pattern.</p>
<pre><code>u: ( 6sit* )</code></pre>
<p>matches <em>sitter</em>.</p>
<p>When using an <code>*</code> word, you can use <code>.</code> to
indicate exactly one character of any value.</p>
<pre><code>u: ( sit*u.tion ) </code></pre>
<p>matches <em>situation</em>.</p>
<h3 id="altering-position-_0-_0--_0">Altering Position <code><</code>
<code>></code> <code>@_0+</code> <code>@_0-</code>
<code>@_0</code></h3>
<p>When you put <code><</code> in your pattern, it doesn’t actually
match anything. It means “reset position” to the start of the
sentence.</p>
<pre><code>u: ( < I love )</code></pre>
<p>matches <em>I love</em> but not <em>do I love</em>.</p>
<p>When you put <code>></code> in your pattern, it does not alter
your position, but it tries to confirm you are on the last word of the
sentence.</p>
<pre><code>u: ( I * > )</code></pre>
<p>in this pattern <code>></code> is redundant, since <code>*</code>
would match to the end of the sentence anyway.</p>
<p>You may also use <code>!></code> to ask that we NOT be at the end
of the sentence.</p>
<p><code>@_1+</code> says to set the position to where the given match
variable (<code>_1</code>) matched. Positional sequencing will continue
normally increasing thereafter.</p>
<p>You can suffix the match variable with <code>-</code> instead, to
tell CS to begin matching in reverse order in the sentence, i.e.,
matching backwards to the start of the sentence.</p>
<p>When you use <code>+</code>, the position starts at the end of the
match. When you use <code>-</code>, the position starts at the start of
the match.</p>
<pre><code>u: ( _home is @_0- pretty )</code></pre>
<p>matches <em>my pretty home is near here</em>.</p>
<p>Note when you use <code>-</code> for reverse matching, the behavior
of <code><</code> and <code>></code> changes. <code>></code>
sets a position and <code><</code> confirms it instead of the way it
is for <code>+</code>.</p>
<p>When you omit both + and -, you create a matchable anchor like
<code>@_0</code>. It represents what was found at that position, and
the pattern must match at that same location again.</p>
<pre><code>u: ( @_0 is @_1 )</code></pre>
<p>The above pattern says that the word <code>is</code> must be
precisely found between the locations referenced by <code>@_0</code> and
<code>@_1</code>.</p>
<h3 id="retrying-scan-retry">Retrying scan <code>@retry</code></h3>
<p>Normally one matches a pattern, performs the output code, and, to
restart the pattern and find the next occurrence of a match, uses
<code>^retry(RULE)</code> or <code>^retry(TOPRULE)</code>. If your
pattern instead executes <code>@retry</code> as a token, it
will retry on its own without needing to execute any output code. This
is useful in conjunction with <code>^testpattern</code>.</p>
<h2 id="debugging">Debugging</h2>
<h3 id="testpattern"><code>:testpattern</code></h3>
<p>The system inputs the sentence and tests the pattern you provide
against it. It tells you whether it matched or failed.</p>
<pre><code>:testpattern ( it died ) Do you know if it died?</code></pre>
<p>Some patterns require variables to be set up certain ways. You can
perform assignments prior to the sentence.</p>
<pre><code>:testpattern ($gender=male hit) $gender = male hit me</code></pre>
<p>Typically you might use <code>:testpattern</code> to check whether a
subset of a failing pattern works, to narrow down what has gone
wrong. You can also name an existing rule, rather than supply a
pattern.</p>
<pre><code>:testpattern ~aliens.ufo do you believe in aliens?</code></pre>
<h3 id="prepare"><code>:prepare</code></h3>
<p>Since CS may revise your input for various reasons, to know why a
pattern fails you may need to know what input it actually saw.</p>
<p>Using <code>:prepare</code> will tell you what the final input words
were, and what concepts got marked.</p>
<pre><code>:prepare This is a sentence.</code></pre>
<h3 id="verify"><code>:verify</code></h3>
<p>In general all of your responders and rejoinders should have a sample
input comment above them.</p>
<pre><code>#! Do you believe in dogs?
?: ( << you believe dog >>) I do.</code></pre>
<p>This allows you to do</p>
<pre><code>:verify ~mytopic pattern</code></pre>
<p>and have the system test if your rule would match your input.</p>
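<p>A sketch of what a topic prepared for <code>:verify</code> might look
like, with a sample input above both a responder and its rejoinder. The
topic keywords, the rejoinder, and the replies are hypothetical:</p>
<pre><code>topic: ~mytopic ( dog believe )

#! Do you believe in dogs?
?: ( << you believe dog >> ) I do.
    #! why
    a: ( why ) I have met several.</code></pre>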
<h3 id="trace"><code>:trace</code></h3>
<p>You can get a trace of various system functions.</p>
<pre><code>:trace pattern</code></pre>
<p>will show you pattern matching and match variable binding. Also
useful if done before <code>:testpattern</code>.</p>
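<p>For instance, a debugging session along these lines (a sketch reusing
the pattern from the <code>:testpattern</code> section, with
<code>:trace none</code> turning tracing back off afterwards) would show
the matching steps for a single test:</p>
<pre><code>:trace pattern
:testpattern ( it died ) Do you know if it died?
:trace none</code></pre>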
<h2 id="overrulingsupplementing-cs-matching">Overruling/Supplementing CS
Matching</h2>
<p>Sometimes you want to supplement the concept marking CS performs by
adding your own marks. This is particularly useful for handling idioms
where no keyword exists. I set <code>$cs_prepass</code> to be a topic
which looks for idioms.</p>
<pre><code>?: ( < what do you do > ) ^mark(~work)</code></pre>
<p>This will cause the work topic to react later as though one of its
keywords was given.</p>
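<p>A minimal sketch of that setup, assuming a hypothetical bot macro named
<code>mybot</code> and a hypothetical topic named <code>~idioms</code>. The
assignment belongs wherever your bot initializes its other control
variables:</p>
<pre><code># hypothetical bot definition; only the $cs_prepass line matters here
outputmacro: mybot()
$cs_prepass = ~idioms

# hypothetical idiom-spotting topic run before the main control topic
topic: ~idioms system ()

#! what do you do
?: ( < what do you do > ) ^mark(~work)</code></pre>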
<p>Likewise sometimes you want to disable some marking. For example,
<em>chocolate</em> is both a flavor and a color. To avoid going to the
color topic incorrectly I might do this:</p>
<pre><code>u: ( << _chocolate [taste eat] >> ) ^unmark(~colorTopic _0)</code></pre>
<p>If the above rule detects <em>chocolate</em> in the apparent context
of eating, it will unmark any reference to <code>~colorTopic</code>
found at the location of the word <em>chocolate</em>.</p>
<h2 id="graduation-exercise">Graduation Exercise</h2>
<p>The pattern matching system of ChatScript has esoteric abilities. I
was asked if I would implement an additional one that would look
something like this:</p>
<pre><code><< green nose mucus >>~3</code></pre>
<p>which the requester wanted to mean: find all those words, in any
order, with each word after the first within a gap range of 3 from the
previous word.</p>
<p>So it could recognize: <em>green is the nose with my mucus</em> or
<em>my nose puts forth green mucus</em> and not match <em>while green is
my favorite color</em> or <em>I don’t want to see it in my nose
mucus</em>.</p>
<p>I replied it could probably already be done in CS as it stood, and a
few minutes later had whipped up code to do that. Your advanced
challenge, if you care to think about it and really warp your mind, is
to think of a way to do it yourself. That will prove you really
understand what can be done in pattern matching. The answer
follows.</p>
<pre><code>patternmacro: ^nearbyword(^word)
[
(@_0+ *~3 _^word ^eval(_0 = _1))
(@_0- *~3 _^word ^eval(_0 = _1))
]</code></pre>
<p>The macro contains two choices, a sequence that looks forward from
where you are and a sequence that looks backwards. Using a nested
<code>()</code> the system will effectively treat that as a single
match, which makes it a single token to be used in a <code>[]</code>
choice.</p>
<p>Whichever <code>()</code> finds the next word, it memorizes where it
is, then sets the current <code>_0</code> location to the new word and
the choice ends.</p>
<p>While you can’t do code execution directly in a pattern, and you
can’t call out to user-defined outputmacros, you can call out to system
functions, and <code>^eval</code> lets you do any amount of normal code
execution. So this allows us to assign the new match variable to the
old. And assigning match variables means assigning all of their
attributes, including original value, canonical value, and actual
position data.</p>
<pre><code>topic: ~test()
u: ( _green ^nearbyword(nose) ^nearbyword(mucus)) You have a disgusting nose.</code></pre>
<p>The test pattern therefore finds the first word and sets the current
<code>_0</code> location. Then it uses the macro to find the next word
and change location, and then does the same for the word after that.</p>
</body>
</html>