I've seen reports of crashes involving the protobuf C++ extension and TensorFlow. While TF 2.2 seems to work fine, TF 2.3 fails to run with the following stacktrace:
File "/home/s3248973/git/easybuild-easyconfigs/easybuild/easyconfigs/t/TensorFlow/TensorFlow-2.x_mnist-test.py", line 15, in <module>
tf.keras.layers.Dense(10, activation='softmax'),
File "/tmp/ebinstall/software/TensorFlow/2.3.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/tensorflow/python/training/tracking/base.py", line 457, in _method_wrapper
result = method(self, *args, **kwargs)
File "/tmp/ebinstall/software/TensorFlow/2.3.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/tensorflow/python/keras/engine/sequential.py", line 117, in __init__
name=name, autocast=False)
File "/tmp/ebinstall/software/TensorFlow/2.3.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/tensorflow/python/training/tracking/base.py", line 457, in _method_wrapper
result = method(self, *args, **kwargs)
File "/tmp/ebinstall/software/TensorFlow/2.3.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 308, in __init__
self._init_batch_counters()
File "/tmp/ebinstall/software/TensorFlow/2.3.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/tensorflow/python/training/tracking/base.py", line 457, in _method_wrapper
result = method(self, *args, **kwargs)
File "/tmp/ebinstall/software/TensorFlow/2.3.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 317, in _init_batch_counters
self._train_counter = variables.Variable(0, dtype='int64', aggregation=agg)
File "/tmp/ebinstall/software/TensorFlow/2.3.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/tensorflow/python/ops/variables.py", line 262, in __call__
return cls._variable_v2_call(*args, **kwargs)
File "/tmp/ebinstall/software/TensorFlow/2.3.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/tensorflow/python/ops/variables.py", line 256, in _variable_v2_call
shape=shape)
File "/tmp/ebinstall/software/TensorFlow/2.3.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/tensorflow/python/ops/variables.py", line 237, in <lambda>
previous_getter = lambda **kws: default_variable_creator_v2(None, **kws)
File "/tmp/ebinstall/software/TensorFlow/2.3.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/tensorflow/python/ops/variable_scope.py", line 2646, in default_variable_creator_v2
shape=shape)
File "/tmp/ebinstall/software/TensorFlow/2.3.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/tensorflow/python/ops/variables.py", line 264, in __call__
return super(VariableMetaclass, cls).__call__(*args, **kwargs)
File "/tmp/ebinstall/software/TensorFlow/2.3.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 1518, in __init__
distribute_strategy=distribute_strategy)
File "/tmp/ebinstall/software/TensorFlow/2.3.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 1666, in _init_from_args
graph_mode=self._in_graph_mode)
File "/tmp/ebinstall/software/TensorFlow/2.3.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 243, in eager_safe_variable_handle
shape, dtype, shared_name, name, graph_mode, initial_value)
File "/tmp/ebinstall/software/TensorFlow/2.3.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 181, in _variable_handle_from_shape_and_dtype
shape=shape.as_proto(), dtype=dtype.as_datatype_enum))
TypeError: Parameter to MergeFrom() must be instance of same class: expected tensorflow.TensorShapeProto got tensorflow.TensorShapeProto.
Removing the C++ extension makes the test succeed.
Hence we shouldn't enable it (the protobuf bundled with TensorFlow doesn't enable it either).
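To confirm which protobuf backend a given installation actually uses, something like the following minimal sketch may help. It relies on `google.protobuf.internal.api_implementation.Type()`, which is an internal protobuf module rather than a stable public API, so treat it as a diagnostic aid only. The helper name `protobuf_backend` is hypothetical and not part of this PR.

```python
import importlib
import os


def protobuf_backend():
    """Return the active protobuf backend name, or None if protobuf is missing."""
    try:
        api = importlib.import_module(
            "google.protobuf.internal.api_implementation")
    except ImportError:
        return None  # protobuf is not installed in this environment
    # Type() reports the implementation in use, e.g. "cpp" or "python".
    return api.Type()


if __name__ == "__main__":
    # protobuf reads PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION at import time;
    # setting it to "python" before the first import requests the
    # pure-Python backend.
    print("requested:", os.environ.get("PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"))
    print("active:", protobuf_backend())
```

With the C++ extension disabled as proposed here, the reported backend would be expected to be `python` instead of `cpp`.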
Test report by @Flamefire SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in this PR)
taurusa4 - Linux centos linux 7.7.1908, x86_64, Intel(R) Xeon(R) CPU E5-2603 v4 @ 1.70GHz, Python 2.7.5
See https://gist.github.com/cb5b17a0941776ca974a8933d8dcbfb7 for a full test report.
@boegel: Request for testing this PR well received on generoso
PR test command 'EB_PR=11260 EB_ARGS= /apps/slurm/default/bin/sbatch --job-name test_PR_11260 ~/boegelbot/eb_from_pr_upload_generoso.sh' executed!
exit code: 0
output:
Submitted batch job 5604
Test results coming soon (I hope)...
Details
- notification for comment with ID 691190673 processed
Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).
boegel changed the title from "Revert building the protobuf-python C++ extension" to "Revert building the C++ extension in the protobuf-python v3.10.0 easyconfigs" on Sep 11, 2020
boegel changed the title from "Revert building the C++ extension in the protobuf-python v3.10.0 easyconfigs" to "disable building the C++ extension in the protobuf-python v3.10.0 easyconfigs" on Sep 11, 2020
Test report by @Flamefire SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in this PR)
taurusi5129.taurus.hrsk.tu-dresden.de - Linux RHEL 7.8, x86_64, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz, Python 2.7.5
See https://gist.github.com/ac0bd1cdeb8bcd9c6cd4f5b874be25ff for a full test report.
Test report by @boegel SUCCESS
Build succeeded for 6 out of 6 (6 easyconfigs in this PR)
node3309.joltik.os - Linux centos linux 7.8.2003, x86_64, Intel(R) Xeon(R) Gold 6242 CPU @ 2.80GHz (cascadelake), Python 3.6.8
See https://gist.github.com/3575c98a1f1e059d2f53d14543aa3a2c for a full test report.
Test report by @boegelbot SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in this PR)
generoso-x-1 - Linux centos linux 8.2.2004, x86_64, Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz (haswell), Python 3.6.8
See https://gist.github.com/c68801c982f22918dbd2df1813475cc4 for a full test report.
Seems like the protobuf C++ extensions are not good: https://github.com/protocolbuffers/protobuf/blob/c6493970296fa5c5b4a81a37248a328579fe9662/python/google/protobuf/internal/api_implementation.py#L69-L71
edit (@boegel): reverts part of #11143