forked from gpoore/pythontex
pythontex3.py
#!/usr/bin/env python
# -*- coding: utf-8 -*-
'''
This is the main PythonTeX script.

Two versions of this script and the other PythonTeX scripts are provided.
One set of scripts, with names ending in "2", runs under Python 2.7. The
other set of scripts, with names ending in "3", runs under Python 3.1 or
later.

This script needs to be able to import pythontex_types*.py; in general it
should be in the same directory. This script creates scripts that need to
be able to import pythontex_utils*.py. The location of that file is
determined via the kpsewhich command, which is part of the Kpathsea library
included with some TeX distributions, including TeX Live and MiKTeX.

Licensed under the BSD 3-Clause License:

Copyright (c) 2012-2013, Geoffrey M. Poore

All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
    * Redistributions of source code must retain the above copyright
      notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright
      notice, this list of conditions and the following disclaimer in the
      documentation and/or other materials provided with the distribution.
    * Neither the name of the <organization> nor the
      names of its contributors may be used to endorse or promote products
      derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL <COPYRIGHT HOLDER> BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
'''
# Imports
#// Python 2
#from __future__ import absolute_import
#from __future__ import division
#from __future__ import print_function
#from __future__ import unicode_literals
#\\ End Python 2
import sys
import os
import argparse
import codecs
# copy.deepcopy() is used in hash_code() when creating console settings
import copy
from hashlib import sha1
from collections import defaultdict
from re import match, sub, search
import subprocess
import multiprocessing
from pygments import highlight
from pygments.lexers import get_lexer_by_name
from pygments.formatters import LatexFormatter
#// Python 2
#from pythontex_types2 import *
#try:
# import cPickle as pickle
#except:
# import pickle
#from io import open
#\\ End Python 2
#// Python 3
from pythontex_types3 import *
import pickle
#\\ End Python 3
# Script parameters
# Version
version = '0.10beta2'


def process_argv(data, temp_data):
    '''
    Process command line options using the argparse module.

    Most options are passed via the file of code, rather than via the command
    line.
    '''
    # Create a command line argument parser
    parser = argparse.ArgumentParser()
    parser.add_argument('TEXNAME',
                        help='LaTeX file, with or without .tex extension')
    parser.add_argument('--version', action='version',
                        version='PythonTeX {0}'.format(data['version']))
    parser.add_argument('--encoding', default='utf-8',
                        help='encoding for all text files (see codecs module for encodings)')
    parser.add_argument('--error-exit-code', default='true',
                        choices=('true', 'false'),
                        help='return exit code of 1 if there are errors (not desirable with some TeX editors and workflows)')
    group = parser.add_mutually_exclusive_group()
    group.add_argument('--runall', nargs='?', default='false',
                       const='true', choices=('true', 'false'),
                       help='run all code, regardless of whether it has been modified; equivalent to package option')
    group.add_argument('--rerun', default='errors',
                       choices=('modified', 'errors', 'warnings', 'all'),
                       help='set conditions for rerunning code; equivalent to package option')
    parser.add_argument('--hashdependencies', nargs='?', default='false',
                        const='true', choices=('true', 'false'),
                        help='hash dependencies (such as external data) to check for modification, rather than using mtime; equivalent to package option')
    parser.add_argument('-v', '--verbose', default=False, action='store_true',
                        help='verbose output')
    args = parser.parse_args()

    # Store the parsed argv in data and temp_data
    data['encoding'] = args.encoding
    if args.error_exit_code == 'true':
        temp_data['error_exit_code'] = True
    else:
        temp_data['error_exit_code'] = False
    # runall is a subset of rerun, so both are stored under rerun
    if args.runall == 'true':
        temp_data['rerun'] = 'all'
    else:
        temp_data['rerun'] = args.rerun
    if args.hashdependencies == 'true':
        temp_data['hashdependencies'] = True
    else:
        temp_data['hashdependencies'] = False
    temp_data['verbose'] = args.verbose

    if args.TEXNAME is not None:
        # Determine if we are dealing with a raw basename or filename, or a
        # path to it. If there's a path, we need to make the document
        # directory the current working directory.
        dir, raw_jobname = os.path.split(args.TEXNAME)
        dir = os.path.expanduser(os.path.normcase(dir))
        if len(dir) > 0:
            os.chdir(dir)
        # If necessary, strip off an extension to find the raw jobname that
        # corresponds to the .pytxcode.
        if not os.path.exists(raw_jobname + '.pytxcode'):
            raw_jobname = raw_jobname.rsplit('.', 1)[0]
            if not os.path.exists(raw_jobname + '.pytxcode'):
                print('* PythonTeX error')
                print(' Code file ' + raw_jobname + '.pytxcode does not exist.')
                print(' Run LaTeX to create it.')
                return sys.exit(1)

        # We need a "sanitized" version of the jobname, with spaces and
        # asterisks replaced with hyphens and double quotes removed. This is
        # done to avoid TeX issues with spaces in file names, paralleling the
        # approach taken in pythontex.sty. From now on, we will use the
        # sanitized version every time we create a file that contains the
        # jobname string. The raw version will only be used in reference to
        # pre-existing files created on the TeX side, such as the .pytxcode
        # file.
        jobname = raw_jobname.replace(' ', '-').replace('"', '').replace('*', '-')
        # Store the results in data
        data['raw_jobname'] = raw_jobname
        data['jobname'] = jobname

        # We need to check to make sure that the "sanitized" jobname doesn't
        # lead to a collision with a file that already has that name, so that
        # two files attempt to use the same PythonTeX folder.
        #
        # If <jobname>.<ext> and <raw_jobname>.<ext> both exist, where <ext>
        # is a common LaTeX extension, we exit. We operate under the
        # assumption that there should only be a single file <jobname> in the
        # document root directory that has a common LaTeX extension. That
        # could be false, but if so, the user probably has worse things to
        # worry about than a potential PythonTeX output collision.
        # If <jobname>* and <raw_jobname>* both exist, we issue a warning but
        # attempt to proceed.
        if jobname != raw_jobname:
            resolved = False
            for ext in ('.tex', '.ltx', '.dtx'):
                if os.path.isfile(raw_jobname + ext):
                    if os.path.isfile(jobname + ext):
                        print('* PythonTeX error')
                        print(' Directory naming collision between the following files:')
                        print(' ' + raw_jobname + ext)
                        print(' ' + jobname + ext)
                        return sys.exit(1)
                    resolved = True
                    break
            if not resolved:
                ls = os.listdir('.')
                for file in ls:
                    if file.startswith(jobname):
                        print('* PythonTeX warning')
                        print(' Potential directory naming collision between the following names:')
                        print(' ' + raw_jobname)
                        print(' ' + jobname + '*')
                        print(' Attempting to proceed.')
                        temp_data['warnings'] += 1
                        break
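
# Illustrative sketch (not part of the original script): the jobname
# sanitization performed in process_argv() can be tried in isolation. The
# helper name sanitize_jobname is hypothetical; the replacement chain is the
# one used in process_argv(). The sample jobnames are made up.

```python
def sanitize_jobname(raw_jobname):
    # Spaces and asterisks become hyphens; double quotes are dropped.
    return raw_jobname.replace(' ', '-').replace('"', '').replace('*', '-')


if __name__ == '__main__':
    for raw in ('report', 'my report', 'my "draft" file*'):
        print(repr(raw), '->', repr(sanitize_jobname(raw)))
```

# Because distinct raw names such as 'my report' and 'my-report' map to the
# same sanitized name, a collision check like the one in process_argv() is
# still needed after sanitizing.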


def load_code_get_settings(data, temp_data):
    '''
    Load the code file, process the settings it contains, and remove the
    settings lines so that the remainder is ready for code processing.
    '''
    # Bring in the .pytxcode file as a list
    raw_jobname = data['raw_jobname']
    encoding = data['encoding']
    if os.path.isfile(raw_jobname + '.pytxcode'):
        f = open(raw_jobname + '.pytxcode', 'r', encoding=encoding)
        pytxcode = f.readlines()
        f.close()
    else:
        print('* PythonTeX error')
        print(' Code file ' + raw_jobname + '.pytxcode does not exist.')
        print(' Run LaTeX to create it.')
        return sys.exit(1)

    # Determine the number of settings lines in the code file.
    # Create a list of settings, and save the code for later processing.
    # The n >= 0 check guards against a file that is empty or that contains
    # nothing but settings lines.
    n = len(pytxcode) - 1
    while n >= 0 and pytxcode[n].startswith('=>PYTHONTEX:SETTINGS#'):
        n -= 1
    pytxsettings = pytxcode[n+1:]
    temp_data['pytxcode'] = pytxcode[:n+1]

    # Prepare to process settings
    #
    # Create a dict for storing settings.
    settings = dict()
    # Create a dict for storing Pygments settings.
    # Each dict entry will itself be a dict.
    pygments_settings = defaultdict(dict)

    # Create a dict of processing functions, and generic processing functions
    settingsfunc = dict()
    def set_kv_data(k, v):
        if v == 'true':
            settings[k] = True
        elif v == 'false':
            settings[k] = False
        else:
            settings[k] = v
    # Need a function for when assignment is only needed if not default value
    def set_kv_temp_data_if_not_default():
        def f(k, v):
            if v != 'default':
                if v == 'true':
                    temp_data[k] = True
                elif v == 'false':
                    temp_data[k] = False
                else:
                    temp_data[k] = v
        return f
    def set_kv_data_fvextfile(k, v):
        # Error checking on TeX side should be enough, but be careful anyway
        try:
            v = int(v)
        except ValueError:
            print('* PythonTeX error')
            print(' Unable to parse package option fvextfile.')
            return sys.exit(1)
        if v < 0:
            settings[k] = sys.maxsize
        elif v == 0:
            settings[k] = 1
            print('* PythonTeX warning')
            print(' Invalid value for package option fvextfile.')
            temp_data['warnings'] += 1
        else:
            settings[k] = v
    def set_kv_pygments_global(k, v):
        # Global Pygments options use a key that can't conflict with anything
        # (inputtype can't ever contain a hash symbol). This key is deleted
        # later, as soon as it is no longer needed. Note that no global
        # lexer can be specified via pygopt; pyglexer is needed for that.
        if k == 'pyglexer':
            if v != '':
                pygments_settings['#GLOBAL']['lexer'] = v
        elif k == 'pygopt':
            options = v.strip('{}').replace(' ', '').split(',')
            # Set default values, modify based on settings
            style = None
            texcomments = None
            mathescape = None
            for option in options:
                if option.startswith('style='):
                    style = option.split('=', 1)[1]
                elif option == 'texcomments':
                    texcomments = True
                elif option.startswith('texcomments='):
                    option = option.split('=', 1)[1]
                    if option in ('true', 'True'):
                        texcomments = True
                elif option == 'mathescape':
                    mathescape = True
                elif option.startswith('mathescape='):
                    option = option.split('=', 1)[1]
                    if option in ('true', 'True'):
                        mathescape = True
                elif option != '':
                    print('* PythonTeX warning')
                    print(' Unknown global Pygments option: ' + option)
                    temp_data['warnings'] += 1
            if style is not None:
                pygments_settings['#GLOBAL']['style'] = style
                pygments_settings['#GLOBAL']['commandprefix'] = 'PYG' + style
            if texcomments is not None:
                pygments_settings['#GLOBAL']['texcomments'] = texcomments
            if mathescape is not None:
                pygments_settings['#GLOBAL']['mathescape'] = mathescape
    def set_kv_pygments_family(k, v):
        inputtype, lexer, options = v.replace(' ','').split(',', 2)
        options = options.strip('{}').split(',')
        # Set default values, modify based on settings
        style = 'default'
        texcomments = False
        mathescape = False
        for option in options:
            if option.startswith('style='):
                style = option.split('=', 1)[1]
            elif option == 'texcomments':
                texcomments = True
            elif option.startswith('texcomments='):
                option = option.split('=', 1)[1]
                if option in ('true', 'True'):
                    texcomments = True
            elif option == 'mathescape':
                mathescape = True
            elif option.startswith('mathescape='):
                option = option.split('=', 1)[1]
                if option in ('true', 'True'):
                    mathescape = True
            elif option != '':
                print('* PythonTeX warning')
                print(' Unknown Pygments option for ' + inputtype + ': ' + '"' + option + '"')
        pygments_settings[inputtype] = {'lexer': lexer,
                                        'style': style,
                                        'texcomments': texcomments,
                                        'mathescape': mathescape,
                                        'commandprefix': 'PYG' + style}
    settingsfunc['outputdir'] = set_kv_data
    settingsfunc['workingdir'] = set_kv_data
    settingsfunc['rerun'] = set_kv_temp_data_if_not_default()
    settingsfunc['hashdependencies'] = set_kv_temp_data_if_not_default()
    settingsfunc['stderr'] = set_kv_data
    settingsfunc['stderrfilename'] = set_kv_data
    settingsfunc['keeptemps'] = set_kv_data
    settingsfunc['pyfuture'] = set_kv_data
    settingsfunc['pygments'] = set_kv_data
    settingsfunc['fvextfile'] = set_kv_data_fvextfile
    settingsfunc['pyglexer'] = set_kv_pygments_global
    settingsfunc['pygopt'] = set_kv_pygments_global
    settingsfunc['pygfamily'] = set_kv_pygments_family
    settingsfunc['pyconbanner'] = set_kv_data
    settingsfunc['pyconfilename'] = set_kv_data
    settingsfunc['depythontex'] = set_kv_data

    # Process settings
    for line in pytxsettings:
        # A hash symbol "#" should never be within content, but be
        # careful just in case by using rsplit('#', 1)[0]
        content = line.replace('=>PYTHONTEX:SETTINGS#', '', 1).rsplit('#', 1)[0]
        key, val = content.split('=', 1)
        try:
            settingsfunc[key](key, val)
        except KeyError:
            print('* PythonTeX warning')
            print(' Unknown option "' + content + '"')
            temp_data['warnings'] += 1

    # Store all results that haven't already been stored.
    data['settings'] = settings
    data['pygments_settings'] = pygments_settings

    # #### Is there a more logical place for this?
    #// Python 2
    ## We save the pyfuture option regardless of the Python version,
    ## but we only use it under Python 2.
    #update_default_code2(data['settings']['pyfuture'])
    #\\ End Python 2
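
# Illustrative sketch (not part of the original script): how the trailing
# settings lines can be split from the code lines and parsed into key/value
# pairs, following the approach of load_code_get_settings(). The helper name
# split_settings and the sample lines are assumptions made for this example;
# only the '=>PYTHONTEX:SETTINGS#' prefix and the split/rsplit calls come from
# the function above. Values are kept as plain strings here rather than being
# dispatched through settingsfunc.

```python
SETTINGS_PREFIX = '=>PYTHONTEX:SETTINGS#'


def split_settings(lines):
    # Walk backward from the end while lines are settings lines.
    n = len(lines) - 1
    while n >= 0 and lines[n].startswith(SETTINGS_PREFIX):
        n -= 1
    code_lines = lines[:n+1]
    settings = {}
    for line in lines[n+1:]:
        # Drop the prefix, then everything from the last '#' onward.
        content = line.replace(SETTINGS_PREFIX, '', 1).rsplit('#', 1)[0]
        key, val = content.split('=', 1)
        settings[key] = val
    return code_lines, settings


if __name__ == '__main__':
    # Hypothetical file contents, for demonstration only.
    sample = ['=>PYTHONTEX#py#default#default#0#code#5#\n',
              'x = 1\n',
              '=>PYTHONTEX:SETTINGS#outputdir=out#\n',
              '=>PYTHONTEX:SETTINGS#rerun=errors#\n']
    code_lines, settings = split_settings(sample)
    print(len(code_lines), settings)
```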


def get_old_data(data, old_data):
    '''
    Load data from the last run, if it exists, into the dict old_data.
    Determine the path to the PythonTeX scripts, either by using a previously
    found, saved path or via kpsewhich.

    The old data is used for determining when PythonTeX has been upgraded,
    when any settings have changed, when code has changed (via hashes), and
    what files may need to be cleaned up. The location of the PythonTeX
    scripts is needed so that they can be imported by the scripts created by
    PythonTeX. The location of the scripts is confirmed even if they were
    previously located, to make sure that the path is still valid. Finding
    the scripts depends on having a TeX installation that includes the
    Kpathsea library (TeX Live and MiKTeX, possibly others).

    All code that relies on old_data is written based on the assumption that
    if old_data exists and has the current PythonTeX version, then it
    contains all needed information. Thus, all code relying on old_data must
    check that it was loaded and that it has the current version. If not,
    code should adapt gracefully.
    '''
    # Create a string containing the name of the data file
    outputdir = data['settings']['outputdir']
    pythontex_data_file = os.path.join(outputdir, 'pythontex_data.pkl')
    # Create a string containing the name of the pythontex_utils*.py file
    # Note that the file name depends on the Python version
    pythontex_utils_file = 'pythontex_utils' + str(sys.version_info[0]) + '.py'

    # Load the old data if it exists (read as binary pickle)
    if os.path.isfile(pythontex_data_file):
        f = open(pythontex_data_file, 'rb')
        old_data.update(pickle.load(f))
        f.close()
        temp_data['loaded_old_data'] = True
    else:
        temp_data['loaded_old_data'] = False

    # Set the scriptpath in the current data
    if temp_data['loaded_old_data'] and os.path.isfile(os.path.join(old_data['scriptpath'], pythontex_utils_file)):
        data['scriptpath'] = old_data['scriptpath']
    else:
        exec_cmd = ['kpsewhich', '--format', 'texmfscripts', pythontex_utils_file]
        try:
            # Get path, convert from bytes to unicode, and strip off eol
            # characters
            # #### Is there a better approach for decoding, in case of non utf-8?
            scriptpath_full = subprocess.check_output(exec_cmd).decode('utf-8').rstrip('\r\n')
        except OSError:
            print('* PythonTeX error')
            print(' Your system appears to lack kpsewhich.')
            return sys.exit(1)
        except subprocess.CalledProcessError:
            print('* PythonTeX error')
            print(' kpsewhich is not happy with its arguments.')
            print(' This command was attempted:')
            print(' ' + ' '.join(exec_cmd))
            return sys.exit(1)
        # Split off the end of the path ("/pythontex_utils*.py")
        scriptpath = os.path.split(scriptpath_full)[0]
        data['scriptpath'] = scriptpath

    # Set path for scripts, via the function from pythontex_types*.py
    # #### More logical location?
    set_utils_location(data['scriptpath'])
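
# Illustrative sketch (not part of the original script): the "load old data
# if it exists" step of get_old_data(), reduced to a standalone helper. The
# name load_old_data and its (dict, bool) return value are assumptions made
# for this example; the original function instead updates old_data in place
# and records the flag in temp_data['loaded_old_data'].

```python
import os
import pickle
import tempfile


def load_old_data(path):
    # Return the unpickled dict and a flag saying whether a file was found.
    old_data = {}
    if os.path.isfile(path):
        with open(path, 'rb') as f:
            old_data.update(pickle.load(f))
        return old_data, True
    return old_data, False


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'pythontex_data.pkl')
        # First run: nothing saved yet.
        print(load_old_data(path))
        # Later run: data from the previous run is available.
        with open(path, 'wb') as f:
            pickle.dump({'version': '0.10beta2'}, f)
        print(load_old_data(path))
```

# Callers should still check the loaded flag and the stored version before
# relying on anything in the returned dict, as the docstring of
# get_old_data() requires.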
def hash_code(data, temp_data, old_data, typedict):
'''
Hash the code to see what has changed and needs to be updated.
Save the hashes in hashdict. Create update_code, a list of bools
regarding whether code should be executed. Create update_pygments, a
list of bools determining what needs updated Pygments highlighting.
Update pygments_settings to account for Pygments (as opposed to PythonTeX)
commands and environments.
'''
# Technically, the code could be simultaneously hashed and divided into
# lists according to (type, session, group). That approach would involve
# some unnecessary list creation and text parsing, but would also have
# the advantage of only handling everything once. The current approach
# is based on simplicity. No speed tests have been performed, but any
# difference between the two approaches should generally be negligible.
#
# Note that the PythonTeX information that accompanies code must be
# hashed in addition to the code itself; the code could stay the same,
# but its context could change, which might require that context-dependent
# code be executed. All of the PythonTeX information is hashed except
# for the input line number. Context-dependent code is going too far if
# it depends on that.
# Create variables to more easily access parts of data
pytxcode = temp_data['pytxcode']
encoding = data['encoding']
loaded_old_data = temp_data['loaded_old_data']
rerun = temp_data['rerun']
hashdependencies = temp_data['hashdependencies']
# Calculate hashes for each set of code (type, session, group).
# We don't have to skip the lines of settings in the code file, because
# they have already been removed.
hasher = defaultdict(sha1)
for codeline in pytxcode:
# Detect the start of a new command/environment
# Switch variables if so
if codeline.startswith('=>PYTHONTEX#'):
inputtype, inputsession, inputgroup = codeline.split('#', 4)[1:4]
currentkey = inputtype + '#' + inputsession + '#' + inputgroup
# If dealing with an external file
if inputsession.startswith('EXT:'):
# We use os.path.normcase to make sure slashes are
# appropriate, thus allowing code in subdirectories to be
# specified
extfile = os.path.normcase(inputsession.replace('EXT:', '', 1))
if not os.path.isfile(extfile):
print('* PythonTeX error')
print(' Cannot find external file ' + extfile)
return sys.exit(1)
# Hash either file contents or mtime
if hashdependencies:
# Read and hash the file in binary. Opening in text mode
# would require an unnecessary decoding and encoding cycle.
f = open(extfile, 'rb')
hasher[currentkey].update(f.read())
f.close()
else:
hasher[currentkey].update(str(os.path.getmtime(extfile)))
# If not dealing with an external file, hash part of code info
else:
# We need to hash most of the code info, because code needs
# to be executed again if anything but the line number changes.
# The text must be encoded to bytes for hashing.
hasher[currentkey].update(codeline.rsplit('#', 2)[0].encode(encoding))
else:
# The text must be encoded to bytes for hashing
hasher[currentkey].update(codeline.encode(encoding))
# Create a dictionary of hashes, in string form
# For PythonTeX (as opposed to Pygments) content, the hashes should
# include the default code, just in case it is ever changed for any reason.
# Based on the order in which the code will be executed, default code
# should technically be hashed first. But we don't know ahead of time
# what entries will be in the hashdict, so we hash it afterward. The
# result is the same, since we get a unique hash. We must also account
# for custom code. This is more awkward, since we don't yet have it in
# a centralized location where we can just add it to the hash. But we do
# have a hash of the custom code, so we just store that with the main hash.
hashdict = dict()
for key in hasher:
inputtype = key.split('#', 1)[0]
if inputtype.startswith('PYG') or inputtype.startswith('CC:'):
hashdict[key] = hasher[key].hexdigest()
else:
hasher[key].update(''.join(typedict[inputtype].default_code).encode(encoding))
for key in hasher:
if not key.startswith('PYG') and not key.startswith('CC:'):
inputtype = key.split('#', 1)[0]
cc_begin_key = 'CC:' + inputtype + ':begin#none#none'
if cc_begin_key in hashdict:
cc_begin_hash = hashdict[cc_begin_key]
else:
cc_begin_hash = ''
cc_end_key = 'CC:' + inputtype + ':end#none#none'
if cc_end_key in hashdict:
cc_end_hash = hashdict[cc_end_key]
else:
cc_end_hash = ''
hashdict[key] = ':'.join([hasher[key].hexdigest(), cc_begin_hash, cc_end_hash])
# Delete the hasher so it can't be accidentally used instead of hashdict
del hasher
# Save the hashdict into data.
data['hashdict'] = hashdict
# See what needs to be updated.
# In the process, copy over macros and files that may be reused.
update_code = dict()
macros = defaultdict(list)
files = defaultdict(list)
dependencies = defaultdict(list)
exit_status = dict()
# We need a function for checking if dependencies have changed.
# We could just always create an updated dict of dependency hashes/mtimes,
# but that's a waste if the code itself has been changed, particularly if
# we are hashing code rather than just using mtimes.
def unchanged_dependencies(key, data, temp_data, old_data):
if key in old_data['dependencies']:
old_dependencies_hashdict = old_data['dependencies'][key]
dependencies_hasher = defaultdict(sha1)
workingdir = data['settings']['workingdir']
missing = False
for dep in old_dependencies_hashdict:
# We need to know if the path is relative (based off the
# working directory) or absolute. We can't use
# os.path.isabs() alone for determining the distinction,
# because we must take into account the possibility of an
# initial ~ (tilde) standing for the home directory.
dep_file = os.path.expanduser(os.path.normcase(dep))
if not os.path.isabs(dep_file):
dep_file = os.path.join(workingdir, dep_file)
if not os.path.isfile(dep_file):
print('* PythonTeX error')
print(' Cannot find dependency "' + dep + '"')
print(' It belongs to ' + ':'.join(key.split('#')))
print(' Relative paths to dependencies must be specified from the working directory.')
temp_data['errors'] += 1
missing = True
elif hashdependencies:
# Read and hash the file in binary. Opening in text mode
# would require an unnecessary decoding and encoding cycle.
f = open(dep_file, 'rb')
dependencies_hasher[dep].update(f.read())
f.close()
else:
dependencies_hasher[dep].update(str(os.path.getmtime(dep_file)))
dependencies_hashdict = dict()
for dep in dependencies_hasher:
dependencies_hashdict[dep] = dependencies_hasher[dep].hexdigest()
if missing:
# Return True so that code doesn't run again; there's no
# point in running it, because we would just get the same
# error back in a different form.
return True
else:
return dependencies_hashdict == old_dependencies_hashdict
else:
return True
# We need a function for determining if exit status requires rerun
# The 'all' and 'modified' cases are technically resolved without actually
# using the function. We also rerun all sessions that produced errors or
# warnings the last time if the stderrfilename has changed.
# #### There is probably a better way to handlestderrfilename, perhaps by
# modifying rerun based on it.
def make_do_not_rerun():
if (loaded_old_data and
data['settings']['stderrfilename'] != old_data['settings']['stderrfilename']):
def func(status):
if status[0] != 0 or status[1] != 0:
return False
else:
return True
elif rerun == 'modified':
def func(status):
return True
elif rerun == 'errors':
def func(status):
if status[0] != 0:
return False
else:
return True
elif rerun == 'warnings':
def func(status):
if status[0] != 0 or status[1] != 0:
return False
else:
return True
elif rerun == 'all':
def func(status):
return False
return func
do_not_rerun = make_do_not_rerun()
# If old data was loaded, and it contained sufficient information, and
# settings are compatible, determine what has changed so that only
# modified code may be executed. Otherwise, execute everything.
# We don't have to worry about checking for changes in pyfuture, because
# custom code and default code are hashed. The treatment of keeptemps
# could be made more efficient (if changed to 'none', just delete old temp
# files rather than running everything again), but given that it is
# intended as a debugging aid, that probable isn't worth it.
# We don't have to worry about hashdependencies changing, because if it
# does the hashes won't match (file contents vs. mtime) and thus code will
# be re-executed.
if (rerun != 'all' and loaded_old_data and
'version' in old_data and
data['version'] == old_data['version'] and
data['encoding'] == old_data['encoding'] and
data['settings']['workingdir'] == old_data['settings']['workingdir'] and
data['settings']['keeptemps'] == old_data['settings']['keeptemps']):
old_hashdict = old_data['hashdict']
old_macros = old_data['macros']
old_files = old_data['files']
old_dependencies = old_data['dependencies']
old_exit_status = old_data['exit_status']
# Compare the hash values, and set which code needs to be run
for key in hashdict:
if key.startswith('CC:'):
pass
elif (key.startswith('PYG') and
key in old_hashdict and
hashdict[key] == old_hashdict[key]):
update_code[key] = False
elif (key in old_hashdict and hashdict[key] == old_hashdict[key] and
do_not_rerun(old_exit_status[key]) and
unchanged_dependencies(key, data, temp_data, old_data)):
update_code[key] = False
exit_status[key] = old_exit_status[key]
if key in old_macros:
macros[key] = old_macros[key]
if key in old_files:
files[key] = old_files[key]
if key in old_dependencies:
dependencies[key] = old_dependencies[key]
else:
update_code[key] = True
else:
for key in hashdict:
if not key.startswith('CC:'):
update_code[key] = True
# Save to data
temp_data['update_code'] = update_code
data['macros'] = macros
data['files'] = files
data['dependencies'] = dependencies
data['exit_status'] = exit_status
# Now that the code that needs updating has been determined, figure out
# what Pygments content needs updating. These are two separate tasks,
# because the code may stay the same but may still need to be highlighted
# if the Pygments settings have changed.
#
# Before determining what Pygments content needs updating, we must check
# for the use of the Pygments commands and environment (as opposed to
# PythonTeX ones), and assign proper Pygments settings if necessary.
# Unlike regular PythonTeX commands and environments, the Pygments
# commands and environment don't automatically create their own Pygments
# settings in the code file. This is because we can't know ahead of time
# which lexers will be needed; these commands and environments take a
# lexer name as an argument. We can only do this now, since we need the
# set of unique (type, session, group).
#
# Any Pygments inputtype that appears will be pygments_macros; otherwise, it
# wouldn't have ever been written to the code file.
# #### Now that settings are at the end of .pytxcode, this could be
# shifted back to the TeX side. It probably should be made uniform, one
# way or another, for the case where pygopt is not used.
pygments_settings = data['pygments_settings']
for key in hashdict:
inputtype = key.split('#', 1)[0]
if inputtype.startswith('PYG') and inputtype not in pygments_settings:
lexer = inputtype.replace('PYG', '', 1)
style = 'default'
texcomments = False
mathescape = False
pygments_settings[inputtype] = {'lexer': lexer,
'style': style,
'texcomments': texcomments,
'mathescape': mathescape,
'commandprefix': 'PYG' + style}
    # Add settings for console, based on type, if these settings haven't
    # already been created by passing explicit console settings from the TeX
    # side.
    # #### 'cons' issues?
    for key in hashdict:
        if key.endswith('cons'):
            inputtype = key.split('#', 1)[0]
            inputtypecons = inputtype + '_cons'
            # Create console settings based on the type, if console settings
            # don't exist.  We go ahead and define default console lexers for
            # many languages, even though only Python is currently supported.
            # If a compatible console lexer can't be found, default to the
            # text lexer (null lexer, does nothing).
            if inputtype in pygments_settings and inputtypecons not in pygments_settings:
                pygments_settings[inputtypecons] = copy.deepcopy(pygments_settings[inputtype])
                lexer = pygments_settings[inputtype]['lexer']
                if lexer in ('Python3Lexer', 'python3', 'py3'):
                    pygments_settings[inputtypecons]['lexer'] = 'pycon'
                    pygments_settings[inputtypecons]['python3'] = True
                elif lexer in ('PythonLexer', 'python', 'py'):
                    pygments_settings[inputtypecons]['lexer'] = 'pycon'
                    pygments_settings[inputtypecons]['python3'] = False
                elif lexer in ('RubyLexer', 'rb', 'ruby', 'duby'):
                    pygments_settings[inputtypecons]['lexer'] = 'rbcon'
                elif lexer in ('MatlabLexer', 'matlab'):
                    pygments_settings[inputtypecons]['lexer'] = 'matlabsession'
                elif lexer in ('SLexer', 'splus', 's', 'r'):
                    pygments_settings[inputtypecons]['lexer'] = 'rconsole'
                elif lexer in ('BashLexer', 'bash', 'sh', 'ksh'):
                    pygments_settings[inputtypecons]['lexer'] = 'console'
                else:
                    pygments_settings[inputtypecons]['lexer'] = 'text'
            # Since console content can't be typeset without the Python side
            # we need to detect whether Pygments was used previously but is
            # used no longer, so that we can generate a non-Pygments version.
            # We need to update code and Pygments to make sure all old content
            # is properly cleaned up.  Also, we need to see if console
            # settings have changed.
            # #### All of this should possibly be done elsewhere; there may
            # be a more logical location.
            if loaded_old_data:
                old_pygments_settings = old_data['pygments_settings']
                if ((inputtypecons in old_pygments_settings and
                        inputtypecons not in pygments_settings) or
                        data['settings']['pyconbanner'] != old_data['settings']['pyconbanner'] or
                        data['settings']['pyconfilename'] != old_data['settings']['pyconfilename']):
                    update_code[key] = True
    # The global Pygments settings are no longer needed, so we delete them.
    # #### Might be a better place to do this, if things earlier are rearranged
    # #### Also, this needs list due to Python 3 ... may be a better approach
    k = list(pygments_settings.keys())
    for s in k:
        if s != '#GLOBAL':
            pygments_settings[s].update(pygments_settings['#GLOBAL'])
    if '#GLOBAL' in pygments_settings:
        del pygments_settings['#GLOBAL']
    # Now we create a dictionary of whether Pygments content needs updating.
    # The first set of conditions is identical to that for update_code,
    # except that workingdir and keeptemps don't have an effect on
    # highlighting.  We also create the TeX style definitions for different
    # Pygments styles.
    update_pygments = dict()
    pygments_macros = defaultdict(list)
    pygments_files = defaultdict(list)
    pygments_style_defs = dict()
    fvextfile = data['settings']['fvextfile']
    if (loaded_old_data and 'version' in old_data and
            data['version'] == old_data['version'] and
            data['encoding'] == old_data['encoding']):
        old_hashdict = old_data['hashdict']
        old_pygments_settings = old_data['pygments_settings']
        old_pygments_macros = old_data['pygments_macros']
        old_pygments_files = old_data['pygments_files']
        old_fvextfile = old_data['settings']['fvextfile']
        old_pygments_style_defs = old_data['pygments_style_defs']
        for key in hashdict:
            if not key.startswith('CC:'):
                inputtype = key.split('#', 1)[0]
                if key.endswith('cons'):
                    inputtype += '_cons'
                # Pygments may not apply to content
                if inputtype not in pygments_settings:
                    update_pygments[key] = False
                # Pygments may apply, but have been done before for identical code
                # using identical settings
                elif (not update_code[key] and
                        inputtype in old_pygments_settings and
                        pygments_settings[inputtype] == old_pygments_settings[inputtype] and
                        fvextfile == old_fvextfile):
                    update_pygments[key] = False
                    if key in old_pygments_macros:
                        pygments_macros[key] = old_pygments_macros[key]
                    if key in old_pygments_files:
                        pygments_files[key] = old_pygments_files[key]
                else:
                    update_pygments[key] = True
        for codetype in pygments_settings:
            pygstyle = pygments_settings[codetype]['style']
            if pygstyle not in pygments_style_defs:
                if pygstyle in old_pygments_style_defs:
                    pygments_style_defs[pygstyle] = old_pygments_style_defs[pygstyle]
                else:
                    commandprefix = pygments_settings[codetype]['commandprefix']
                    formatter = LatexFormatter(style=pygstyle, commandprefix=commandprefix)
                    pygments_style_defs[pygstyle] = formatter.get_style_defs()
    else:
        for key in hashdict:
            if not key.startswith('CC:'):
                inputtype = key.split('#', 1)[0]
                if key.endswith('cons'):
                    inputtype += '_cons'
                if inputtype in pygments_settings:
                    update_pygments[key] = True
                else:
                    update_pygments[key] = False
        for codetype in pygments_settings:
            pygstyle = pygments_settings[codetype]['style']
            if pygstyle not in pygments_style_defs:
                commandprefix = pygments_settings[codetype]['commandprefix']
                formatter = LatexFormatter(style=pygstyle, commandprefix=commandprefix)
                pygments_style_defs[pygstyle] = formatter.get_style_defs()
    # Save to data
    temp_data['update_pygments'] = update_pygments
    data['pygments_macros'] = pygments_macros
    data['pygments_files'] = pygments_files
    data['pygments_style_defs'] = pygments_style_defs
    # Clean up old files, if possible
    # Check for 'files' and 'pygments_files' keys, for upgrade purposes
    # #### Might be able to clean this up a bit, especially if redo some Pygments
    if (loaded_old_data and
            'files' in old_data and
            'pygments_files' in old_data):
        # Clean up for code that will be run again, and for code that no
        # longer exists.  We use os.path.normcase() to fix slashes in the path
        # name, in an attempt to make saved paths platform-independent.
        old_hashdict = old_data['hashdict']
        old_files = old_data['files']
        old_pygments_files = old_data['pygments_files']
        for key in hashdict:
            if not key.startswith('CC:'):
                if update_code[key]:
                    if key in old_files:
                        for f in old_files[key]:
                            f = os.path.expanduser(os.path.normcase(f))
                            if os.path.isfile(f):
                                os.remove(f)
                    if key in old_pygments_files:
                        for f in old_pygments_files[key]:
                            f = os.path.expanduser(os.path.normcase(f))
                            if os.path.isfile(f):
                                os.remove(f)
                elif update_pygments[key] and key in old_pygments_files:
                    for f in old_pygments_files[key]:
                        f = os.path.expanduser(os.path.normcase(f))
                        if os.path.isfile(f):
                            os.remove(f)
        for key in old_hashdict:
            if key not in hashdict:
                for f in old_files[key]:
                    f = os.path.expanduser(os.path.normcase(f))
                    if os.path.isfile(f):
                        os.remove(f)
                for f in old_pygments_files[key]:
                    f = os.path.expanduser(os.path.normcase(f))
                    if os.path.isfile(f):
                        os.remove(f)
    elif loaded_old_data:
        print('* PythonTeX warning')
        print('    PythonTeX may not have been able to clean up old files.')
        print('    This should not cause problems.')
        print('    Delete the PythonTeX directory and run again to remove any unused files.')
        temp_data['warnings'] += 1
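

# ---------------------------------------------------------------------------
# Illustrative sketch only; this helper is NOT part of PythonTeX.
#
# The cleanup loops above repeat one pattern five times: normalize a saved
# path, check that it still names a regular file, and delete it.  A minimal
# sketch of how that pattern could be factored out is given below.  The name
# `remove_stale_files` and its return value are assumptions made for this
# example; PythonTeX itself inlines the loops exactly as written above.
# ---------------------------------------------------------------------------
import os  # repeated here only so that the sketch stands alone


def remove_stale_files(paths):
    '''
    Delete every entry of `paths` that names an existing regular file.

    Each path is normalized in the same way as in the loops above
    (os.path.normcase() followed by os.path.expanduser()).  Paths that do
    not exist, or that are not regular files, are skipped silently.
    Returns the list of normalized paths that were actually removed.
    '''
    removed = []
    for f in paths:
        f = os.path.expanduser(os.path.normcase(f))
        if os.path.isfile(f):
            os.remove(f)
            removed.append(f)
    return removed

# Hypothetical usage, mirroring the loops above:
#     remove_stale_files(old_files[key])
#     remove_stale_files(old_pygments_files[key])
# Returning the removed paths is a design choice for this sketch; it makes
# the helper easy to test and to log, but the original loops return nothing.
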
def parse_code_write_scripts(data, temp_data, typedict):
    '''
    Parse the code file into separate scripts, based on
    (type, session, groups).  Write the script files.
    '''
    codedict = defaultdict(list)
    consoledict = defaultdict(list)
    # Create variables to ease data access
    hashdict = data['hashdict']
    outputdir = data['settings']['outputdir']
    workingdir = data['settings']['workingdir']
    encoding = data['encoding']
    pytxcode = temp_data['pytxcode']
    update_code = temp_data['update_code']
    update_pygments = temp_data['update_pygments']
    files = data['files']
    # We need to keep track of the last instance for each session, so
    # that duplicates can be eliminated.  Some LaTeX environments process
    # their contents multiple times and thus will create duplicates.  We
    # need to initialize everything at -1, since instances begin at zero.
    def negonefactory():
        return -1
    lastinstance = defaultdict(negonefactory)
    for codeline in pytxcode:
        # Detect if start of new command/environment; if so, get new variables
        if codeline.startswith('=>PYTHONTEX#'):
            [inputtype, inputsession, inputgroup, inputinstance, inputcommand, inputcontext, inputline] = codeline.split('#')[1:8]
            currentkey = inputtype + '#' + inputsession + '#' + inputgroup
            currentinstance = int(inputinstance)
            # We need to determine whether code needs to be placed in the
            # consoledict or the codedict.  In the process, we need to make
            # sure that code that appears multiple times in the .pytxcode is
            # only actually copied once.
            addcode = False
            addconsole = False
            if not inputgroup.endswith('verb') and lastinstance[currentkey] < currentinstance:
                lastinstance[currentkey] = currentinstance
                if (inputgroup.endswith('cons') and
                        (update_code[currentkey] or update_pygments[currentkey])):
                    addconsole = True
                    consoledict[currentkey].append(codeline)
                elif currentkey.startswith('CC:') or update_code[currentkey]:
                    switched = True
                    addcode = True
                    if inputcommand == 'inline':
                        inline = True
                    else:
                        inline = False
                        # Correct for line numbering in environments; content
                        # doesn't really start till the line after the "\begin"
                        inputline = str(int(inputline)+1)
                    if currentkey.startswith('CC:'):
                        inputinstance = 'customcode'
                        inputline += ' (in custom code)'
        # Only collect for a session (and later write it to a file) if it needs to be updated
        elif addconsole:
            consoledict[currentkey].append(codeline)
        elif addcode:
            # If just switched commands/environments, associate with the input
            # line and check for indentation errors
            if switched:
                switched = False
                if inputtype.startswith('CC:'):
                    codedict[currentkey].append(typedict[inputtype.split(':')[1]].set_inputs_var(inputinstance, inputcommand, inputcontext, inputline))
                else:
                    codedict[currentkey].append(typedict[inputtype].set_inputs_var(inputinstance, inputcommand, inputcontext, inputline))
                # We need to make sure that each time we switch, we are
                # starting out with no indentation.  Technically, we could
                # allow indentation to continue between commands and
                # environments, but that seems like a recipe for disaster.
                if codeline.startswith(' ') or codeline.startswith('\t'):
                    print('* PythonTeX error')
                    print('    Command/environment cannot begin with indentation (space or tab) near line ' + inputline)
                    sys.exit(1)
            if inline:
                codedict[currentkey].append(typedict[inputtype].inline(codeline))
            else:
                codedict[currentkey].append(codeline)
    # Save codedict and consoledict
    temp_data['codedict'] = codedict
    temp_data['consoledict'] = consoledict
    # Update custom code
    for codetype in typedict: