Skip to content

Flow allocator with bandwidth reservation#91

Open
autrimpo wants to merge 12 commits intorlite:masterfrom
autrimpo:fa-bw-res
Open

Flow allocator with bandwidth reservation#91
autrimpo wants to merge 12 commits intorlite:masterfrom
autrimpo:fa-bw-res

Conversation

@autrimpo
Copy link
Copy Markdown
Contributor

No description provided.

@autrimpo
Copy link
Copy Markdown
Contributor Author

Hi @vmaffione, do you want me to do some refactoring and cleanup before doing a code review?

@vmaffione
Copy link
Copy Markdown
Collaborator

Yes if you want your code to be merged. You can have a commit introducing the changes affecting existing functionalities, and another one that actually introduces the bwres mechanism. Also you need to fix integration tests (see "Details" on the travis info above).

@autrimpo
Copy link
Copy Markdown
Contributor Author

Okay, I'll start working on it.

@autrimpo autrimpo force-pushed the fa-bw-res branch 2 times, most recently from a01ca11 to bfced06 Compare September 18, 2019 08:25
@autrimpo
Copy link
Copy Markdown
Contributor Author

Hopefully this looks okay, let me know if you want any more changes.

@vmaffione
Copy link
Copy Markdown
Collaborator

Thanks, I'll let you know asap.

Comment thread user/libs/ker-numtables.c Outdated
Comment thread user/libs/utils.c Outdated
Comment thread README.md Outdated
Comment thread demo/libdemo.py Outdated
Comment thread user/uipcps/uipcp-container.c Outdated
Comment thread user/uipcps/uipcp-container.c Outdated
Comment thread user/uipcps/uipcp-normal-lfdb.hpp Outdated
Comment thread user/uipcps/uipcp-normal-lfdb.cpp Outdated
Comment thread user/uipcps/uipcp-normal-lfdb.cpp Outdated
Comment thread user/uipcps/uipcp-normal-lfdb.cpp Outdated
@autrimpo
Copy link
Copy Markdown
Contributor Author

autrimpo commented Oct 1, 2019

I've fixed the issues you've mentioned.
Sorry for the delay but the graph algorithm was getting stuck in a loop, as some of the semantics have changed with the removeal of .from, so I had to write the algorithm down and work it through step by step to figure out how to change it.

The pipeline fails, which seems to be related to this bug. Would it be possible to upgrade the gcc version used in the pipeline?

@vmaffione
Copy link
Copy Markdown
Collaborator

Thanks a lot. Take your time.
Unfortunately we don't have much control on the travis build environment. However, this is not a problem, as the {0} initializer is not in C99 (that we are using). Can you please replace that with memset(&ifr, 0, sizeof(ifr));?

Comment thread user/libs/ker-numtables.c Outdated
Comment thread user/libs/utils.c Outdated
Comment thread user/uipcps/uipcp-normal-flow-alloc.cpp Outdated
Comment thread user/uipcps/uipcp-normal-flow-alloc.cpp
@autrimpo
Copy link
Copy Markdown
Contributor Author

autrimpo commented Oct 2, 2019

I'll submit another PR after this one adding all the config-generated files to gitignore, is that okay?

@vmaffione
Copy link
Copy Markdown
Collaborator

vmaffione commented Oct 2, 2019

Yes, feel free to open that whenever you want. That one can go in immediately.

Comment thread user/uipcps/uipcp-normal-lfdb.hpp Outdated
@autrimpo
Copy link
Copy Markdown
Contributor Author

Pushed an amended commit fixing the constructor issue.

Copy link
Copy Markdown
Collaborator

@vmaffione vmaffione left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First round of review.
Still need to look at the core logic.

Comment thread README.md Outdated
Comment thread README.md Outdated
Comment thread CMakeLists.txt
Comment thread user/uipcps/lfdb-bw-test.cpp Outdated
Comment thread user/uipcps/lfdb-bw-test.cpp Outdated
Comment thread user/uipcps/uipcp-container.c
Comment thread user/uipcps/uipcp-container.c
Comment thread user/uipcps/uipcp-container.c
Comment thread user/uipcps/uipcp-container.h Outdated
Comment thread user/uipcps/uipcp-normal-flow-alloc.cpp Outdated
@vmaffione
Copy link
Copy Markdown
Collaborator

I am done for now, but since the change is complex and I already dropped many comments, I would like to give a further pass once you address my comments.

A general question: I assume (and it looks like) the implementation follows the guidelines that we agreed on the "Bw allocation scheme" google doc. Can you confirm?

One big thing that is missing in this pull request is an integration test, maybe something similar to
tests/integration/0027-ceft-addralloc.sh. Such a test would ensure that we never break your code in the future. With so much complexity, it's very likely that a future unrelated change will break it (it happened multiple times in the past, before I introduced the integration tests framework).
Even before that, it would ensure that none of the changes you will make to address my comments will break it. Maybe you can just prepare a test that reproduces the scenario that you setup manually to test your code (I guess it's a 4 or 5 nodes scenario).

@autrimpo
Copy link
Copy Markdown
Contributor Author

autrimpo commented Nov 5, 2019

I see, in that case I will start working on the issues.

Yes, the implementation should be consistent with the final version of the scheme we agreed on in the google doc.

The test scenario used 5 nodes, I should be able to find the demonstrator config as well as possibly some scripts I used to test the implementation. I will begin by creating an integration test.

@vmaffione
Copy link
Copy Markdown
Collaborator

Excellent, thanks. Feel free to add the demonstrator config file to the example if you think it is in shape.

@autrimpo
Copy link
Copy Markdown
Contributor Author

Hi Vincenzo, I hope you're doing well.

Sorry for the long delay; I've started working on the PR again. I've resolved all the comments where I worked in your changes, except the one with #if 0 - I'm not sure I understood you correctly.
I will try to create an integration test asap and after that push my commits.

@vmaffione
Copy link
Copy Markdown
Collaborator

Hi,
Sounds good! Please also merge the latest master in your branch. There have been some minor changes, including a bug in your unit test (triggered by a ./configure --debug build).

@autrimpo
Copy link
Copy Markdown
Contributor Author

autrimpo commented Apr 4, 2020

I've pushed a partial integration test, based on 0027 and 0028. I know it's not possible to set the interface speed on veth interfaces with ethtool, but they do report 10Gbps when created and checked manually. As you can see, dif-rib-show reports 0 for all the lower flows. Any clues why that might be? I've checked in the demonstrator with qemu and the speeds are reported correctly there.

(Ignore the build error, I'll look into that.)

autrimpo added 5 commits April 4, 2020 21:31
Implements a FA and a Routing policy that is bandwidth-aware.
A flow request is accepted only if there is enough bandwidth available.
Each flow is guaranteed to have access to the bandwidth it requested during the whole duration of the flow.
@autrimpo
Copy link
Copy Markdown
Contributor Author

autrimpo commented Apr 4, 2020

I can't reproduce the compilation error with local gcc 5.5.0 (nor the system 9.2.0). I thought rebasing onto current master would help but no luck. What I find curious is that the reconfigure function is practically identical with the CentralizedFaultTolerantDFT::reconfigure() in master except policy parameter prefixes - and master builds fine.

@vmaffione
Copy link
Copy Markdown
Collaborator

I've pushed a partial integration test, based on 0027 and 0028. I know it's not possible to set the interface speed on veth interfaces with ethtool, but they do report 10Gbps when created and checked manually. As you can see, dif-rib-show reports 0 for all the lower flows. Any clues why that might be? I've checked in the demonstrator with qemu and the speeds are reported correctly there.

(Ignore the build error, I'll look into that.)

With QEMU interfaces are virtio-net or e1000, and not veth.
I don't know why you get 0 as interface speed. Have you tried to log the return value of ethtool_cmd_speed() to see how many times it is called, and what returns on each call? Maybe it's called too early. Have you tried to write a stand-alone program to test the uipcp_get_if_speed and see what happens with veths? In any case looking here
https://stackoverflow.com/questions/2872058/get-link-speed-programmatically
there are some minor differences with your code, and you may want to look at those. It is also possible to read from /sys/class/net/eth0/speed.

@vmaffione
Copy link
Copy Markdown
Collaborator

I can't reproduce the compilation error with local gcc 5.5.0 (nor the system 9.2.0). I thought rebasing onto current master would help but no luck. What I find curious is that the reconfigure function is practically identical with the CentralizedFaultTolerantDFT::reconfigure() in master except policy parameter prefixes - and master builds fine.

Look at the includes. I think you are missing at least <algorithm>. In any case, I agree the error message is really confusing....

Comment thread tests/integration/0031-bw-res-flowalloc.sh
Comment on lines -684 to -686
if (f.second->lower_ipcp_id != (rl_ipcp_id_t)RL_IPCP_ID_NONE) {
continue;
}
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was the offending part, seems like I misunderstood your original comment.
Does the current version make sense?
If we don't have a lower_ipcp_id, we try to look it up (and possibly return early), else we use the value we already have from the iterator.

Copy link
Copy Markdown
Collaborator

@vmaffione vmaffione Apr 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should invert the logic:

if (f.second->lower_ipcp_id == (rl_ipcp_id_t)RL_IPCP_ID_NONE) {
            continue;
     }

The lower ipcp id of a lower flow is not something that you try to look up. Either it's there (since when the NeighFlow object was built), or it will never be there.
Btw, lower ipcp id will be always valid if you don't enable the reliable-n-flows parameter.

I'll try to rephrase my original comment. Since you run a loop with the goal of calling lf->set_bw_total() and lf->set_bw_free(), I think you should check that you actually called that, i.e. check that after the loop bw_free and bw_total are not 0. What happens for a normal IPCP that has a normal IPCP as an N-1-DIF (supporting diff)? In this case I think you would have bw_free and bw_total set to 0. How do you behave in this case? If this case is not supported yet, you should probably prevent the installation of the bwres policy for normal IPCPs that are not running over shims.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this case is not supported yet. I'll look into preventing the policy from being installed.

Comment thread tests/integration/0031-bw-res-flowalloc.sh
@autrimpo
Copy link
Copy Markdown
Contributor Author

I've added a working integration test.

However, it seems that if a node is both the endpoint of a flow and the Raft leader, it gets the OpcodeFree message twice for the same flow id, which would double free and crash the node.
For now I've added an additional check when freeing to avoid this, as I haven't managed to track down why exactly this is happening.

Additionally, due to having to test with 10Gbps interfaces, I converted the bandwidth values in the policy to 64 bits else it would overflow.
What is left unchanged is

  • dtcp_config.bandwidth in rlite/common.h- I wasn't sure if this was implementation defined or spec defined so I left it alone for now
  • passing policy parameter values throught rlite-ctl

@vmaffione
Copy link
Copy Markdown
Collaborator

vmaffione commented Apr 11, 2020 via email

@vmaffione
Copy link
Copy Markdown
Collaborator

I've added a working integration test.

However, it seems that if a node is both the endpoint of a flow and the Raft leader, it gets the OpcodeFree message twice for the same flow id, which would double free and crash the node.
For now I've added an additional check when freeing to avoid this, as I haven't managed to track down why exactly this is happening.

Additionally, due to having to test with 10Gbps interfaces, I converted the bandwidth values in the policy to 64 bits else it would overflow.

Yes, bandwidth in bps must be 4 bits.

What is left unchanged is

* `dtcp_config.bandwidth` in `rlite/common.h`- I wasn't sure if this was implementation defined or spec defined so I left it alone for now

No, that's simply an issue. The spec struct is already 64 bits (include/rina/api.h) Please change the bandwidth value to 64 bit in rlite/common.h. You also need to update the kernel and bump the control API version number (RL_API_VERSION), since this change breaks the ABI.

* passing policy parameter values throught `rlite-ctl`

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants