disable building the C++ extension in the protobuf-python v3.10.0 easyconfigs #11260

Merged
boegel merged 1 commit into easybuilders:develop from Flamefire:20200911180422_new_pr_protobuf-python3100
Sep 11, 2020

@Flamefire
Contributor

@Flamefire Flamefire commented Sep 11, 2020

The protobuf C++ extension seems to be problematic: https://github.com/protocolbuffers/protobuf/blob/c6493970296fa5c5b4a81a37248a328579fe9662/python/google/protobuf/internal/api_implementation.py#L69-L71

I've seen reports of crashes involving this extension and TensorFlow. While TF 2.2 seems to work fine, TF 2.3 fails to run with the following stack trace:

  File "/home/s3248973/git/easybuild-easyconfigs/easybuild/easyconfigs/t/TensorFlow/TensorFlow-2.x_mnist-test.py", line 15, in <module>
    tf.keras.layers.Dense(10, activation='softmax'),
  File "/tmp/ebinstall/software/TensorFlow/2.3.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/tensorflow/python/training/tracking/base.py", line 457, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "/tmp/ebinstall/software/TensorFlow/2.3.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/tensorflow/python/keras/engine/sequential.py", line 117, in __init__
    name=name, autocast=False)
  File "/tmp/ebinstall/software/TensorFlow/2.3.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/tensorflow/python/training/tracking/base.py", line 457, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "/tmp/ebinstall/software/TensorFlow/2.3.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 308, in __init__
    self._init_batch_counters()
  File "/tmp/ebinstall/software/TensorFlow/2.3.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/tensorflow/python/training/tracking/base.py", line 457, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "/tmp/ebinstall/software/TensorFlow/2.3.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 317, in _init_batch_counters
    self._train_counter = variables.Variable(0, dtype='int64', aggregation=agg)
  File "/tmp/ebinstall/software/TensorFlow/2.3.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/tensorflow/python/ops/variables.py", line 262, in __call__
    return cls._variable_v2_call(*args, **kwargs)
  File "/tmp/ebinstall/software/TensorFlow/2.3.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/tensorflow/python/ops/variables.py", line 256, in _variable_v2_call
    shape=shape)
  File "/tmp/ebinstall/software/TensorFlow/2.3.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/tensorflow/python/ops/variables.py", line 237, in <lambda>
    previous_getter = lambda **kws: default_variable_creator_v2(None, **kws)
  File "/tmp/ebinstall/software/TensorFlow/2.3.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/tensorflow/python/ops/variable_scope.py", line 2646, in default_variable_creator_v2
    shape=shape)
  File "/tmp/ebinstall/software/TensorFlow/2.3.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/tensorflow/python/ops/variables.py", line 264, in __call__
    return super(VariableMetaclass, cls).__call__(*args, **kwargs)
  File "/tmp/ebinstall/software/TensorFlow/2.3.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 1518, in __init__
    distribute_strategy=distribute_strategy)
  File "/tmp/ebinstall/software/TensorFlow/2.3.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 1666, in _init_from_args
    graph_mode=self._in_graph_mode)
  File "/tmp/ebinstall/software/TensorFlow/2.3.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 243, in eager_safe_variable_handle
    shape, dtype, shared_name, name, graph_mode, initial_value)
  File "/tmp/ebinstall/software/TensorFlow/2.3.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 181, in _variable_handle_from_shape_and_dtype
    shape=shape.as_proto(), dtype=dtype.as_datatype_enum))
TypeError: Parameter to MergeFrom() must be instance of same class: expected tensorflow.TensorShapeProto got tensorflow.TensorShapeProto.
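
The error message names the same type twice, which can look contradictory. One plausible reading (an assumption, not something confirmed in this thread) is that two distinct class objects sharing one full name were involved, so an identity-based type check failed even though the names matched. The sketch below reproduces that situation in plain Python without protobuf or TensorFlow; `make_message_class` and `merge_from` are hypothetical helpers written only for this illustration.

```python
def make_message_class(full_name):
    """Build a fresh class object carrying the given qualified name (hypothetical helper)."""
    module, _, name = full_name.rpartition(".")
    return type(name, (object,), {"__module__": module, "full_name": full_name})


def merge_from(target, other):
    """Mimic a type-checked merge: reject instances of a different class object."""
    if not isinstance(other, type(target)):
        raise TypeError(
            "Parameter to MergeFrom() must be instance of same class: "
            "expected %s got %s." % (type(target).full_name, type(other).full_name)
        )


# Two separately created classes with an identical full name.
ShapeA = make_message_class("tensorflow.TensorShapeProto")
ShapeB = make_message_class("tensorflow.TensorShapeProto")

try:
    merge_from(ShapeA(), ShapeB())
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

The two classes print the same name but are different objects, so the `isinstance` check fails and the message looks self-contradictory.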

Removing the C++ extension makes the test succeed.

Hence we shouldn't enable it (the protobuf bundled with TensorFlow doesn't enable it either).
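
To check which protobuf backend an installation actually uses, something like the following minimal sketch may help. It relies on `api_implementation.Type()` from the module linked above; the `protobuf_backend` wrapper is a hypothetical helper that returns `None` when protobuf is not installed.

```python
def protobuf_backend():
    """Return the active protobuf backend name, or None if protobuf is not installed."""
    try:
        from google.protobuf.internal import api_implementation
    except ImportError:
        return None
    # Typically "python" for the pure-Python backend or "cpp" for the C++ extension.
    return api_implementation.Type()


backend = protobuf_backend()
print("protobuf backend:", backend)
```

As a runtime-only workaround, setting the environment variable `PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python` before `google.protobuf` is first imported should select the pure-Python backend; this PR instead stops building the extension in the easyconfigs.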

edit (@boegel): reverts part of #11143

@Flamefire
Contributor Author

Test report by @Flamefire
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in this PR)
taurusml24 - Linux RHEL 7.6, POWER, 8335-GTX, Python 2.7.5
See https://gist.github.com/6562424e56a5cc52d0ae2f750da971e3 for a full test report.

@Flamefire
Contributor Author

Test report by @Flamefire
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in this PR)
taurusa4 - Linux centos linux 7.7.1908, x86_64, Intel(R) Xeon(R) CPU E5-2603 v4 @ 1.70GHz, Python 2.7.5
See https://gist.github.com/cb5b17a0941776ca974a8933d8dcbfb7 for a full test report.

boegel added this to the next release (4.3.0) milestone Sep 11, 2020
boegel added the bug fix label Sep 11, 2020
@boegel
Member

boegel commented Sep 11, 2020

@boegelbot please test @ generoso

@boegelbot
Collaborator

@boegel: Request for testing this PR well received on generoso

PR test command 'EB_PR=11260 EB_ARGS= /apps/slurm/default/bin/sbatch --job-name test_PR_11260 ~/boegelbot/eb_from_pr_upload_generoso.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 5604

Test results coming soon (I hope)...

Details

- notification for comment with ID 691190673 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

boegel changed the title from "Revert building the protobuf-python C++ extension" to "Revert building the C++ extension in the protobuf-python v3.10.0 easyconfigs" Sep 11, 2020
boegel changed the title from "Revert building the C++ extension in the protobuf-python v3.10.0 easyconfigs" to "disable building the C++ extension in the protobuf-python v3.10.0 easyconfigs" Sep 11, 2020
@Flamefire
Contributor Author

Test report by @Flamefire
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in this PR)
taurusi5129.taurus.hrsk.tu-dresden.de - Linux RHEL 7.8, x86_64, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz, Python 2.7.5
See https://gist.github.com/ac0bd1cdeb8bcd9c6cd4f5b874be25ff for a full test report.

@boegel
Member

boegel commented Sep 11, 2020

Test report by @boegel
SUCCESS
Build succeeded for 6 out of 6 (6 easyconfigs in this PR)
node3309.joltik.os - Linux centos linux 7.8.2003, x86_64, Intel(R) Xeon(R) Gold 6242 CPU @ 2.80GHz (cascadelake), Python 3.6.8
See https://gist.github.com/3575c98a1f1e059d2f53d14543aa3a2c for a full test report.

Member

@boegel boegel left a comment

lgtm

@boegel
Member

boegel commented Sep 11, 2020

Going in, thanks @Flamefire!

@boegel boegel merged commit 7f4c658 into easybuilders:develop Sep 11, 2020
boegel added a commit to migueldiascosta/easybuild-easyconfigs that referenced this pull request Sep 11, 2020
@boegelbot
Collaborator

Test report by @boegelbot
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in this PR)
generoso-x-1 - Linux centos linux 8.2.2004, x86_64, Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz (haswell), Python 3.6.8
See https://gist.github.com/c68801c982f22918dbd2df1813475cc4 for a full test report.

@Flamefire Flamefire deleted the 20200911180422_new_pr_protobuf-python3100 branch September 14, 2020 07:01