
Accounts in Database + Account Manager#2424

Open
neskk wants to merge 29 commits into RocketMap:develop from neskk:pr-accounts-db

Conversation

@neskk
Contributor

@neskk neskk commented Jan 6, 2018

[Beta/Safe Stage] - Mostly tested, need more feedback.
Reworked pretty much every place where accounts are used and created a single entity responsible for handling all operations: an account manager.

Description

Accounts are now saved in the database and their status is kept up to date.
AccountManager handles all the queues and is responsible for keeping account information updated.
When an error/captcha is encountered, accounts remain allocated to the current instance - preserving the old behavior.

  • Detects possible duplicate instance launches. This is an attempt to prevent accounts from being allocated to two instances at the same time. See the discussion below. Instance IDs should be unique to your configuration. When restarting an instance, allow it to cool off for 10 seconds and the error message won't be displayed.
  • Captcha code refactored into the account manager - generic way of handling captchas and reduced duplicate code.
  • Shadow ban detection and handling. --shadow-ban-scan: true allows these accounts to continue scanning. Note: detection is never disabled and will record no-rares in thread status.
    • Updated common Pokemon list.
  • --hlvl-workers controls the number of spare high-level accounts that the account manager will keep allocated. If you had a CSV with 3 accounts, you can achieve the same effect with -hw 3.
  • --hlvl-workers-holding-time controls how long high-level accounts can remain idle. After this time they are deallocated and can be picked up by other instances. Defaults to 120 seconds.
  • "On-the-fly" account allocation is enabled by default: --hlvl-workers 0 or -hw 0
    • When we want to encounter a Pokemon, the account manager will attempt to allocate an account from the database if there are no spare high-level accounts available.
    • If --hlvl-workers-holding-time is set to 0, the account manager immediately releases the account back to the spare pool after it's used.
    • Important: if your configuration has bursts of encounters to process, "on-the-fly" allocation would result in many high-level accounts being allocated and then moved to the spare queue. Even with --hlvl-workers-holding-time set to 0, this is inefficient for setups with a large volume of encounters, because the API object ends up being recreated every single time, not "respecting" --no-api-store (repeating the whole login simulation process = many requests).
  • --hlvl-workers-max defaults to 2 and controls the maximum number of high-level accounts that can be allocated to the current instance at any given time. This setting is only used when --hlvl-workers is 0 ("on-the-fly" allocation), to prevent a burst of encounters from suddenly allocating all the accounts in the database.
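The interplay between --hlvl-workers and --hlvl-workers-max described above can be sketched as a small allocation guard. This is an illustrative sketch with hypothetical names, not the PR's actual implementation:

```python
class HlvlAllocator:
    """Sketch of the high-level account allocation guard (hypothetical)."""

    def __init__(self, hlvl_workers, hlvl_workers_max):
        self.hlvl_workers = hlvl_workers          # spare accounts to keep allocated
        self.hlvl_workers_max = hlvl_workers_max  # cap for "on-the-fly" mode
        self.allocated = 0

    def can_allocate(self):
        # "On-the-fly" mode: --hlvl-workers 0, capped by --hlvl-workers-max
        # so a burst of encounters can't drain the whole account pool.
        if self.hlvl_workers == 0:
            return self.allocated < self.hlvl_workers_max
        # Fixed-pool mode: keep up to --hlvl-workers accounts allocated.
        return self.allocated < self.hlvl_workers

    def allocate(self):
        if not self.can_allocate():
            return False
        self.allocated += 1
        return True

    def release(self):
        # E.g. after --hlvl-workers-holding-time expires.
        self.allocated = max(0, self.allocated - 1)
```

For example, `HlvlAllocator(0, 2)` (on-the-fly mode) allows at most two concurrent high-level allocations, while `HlvlAllocator(3, 2)` ignores the cap and maintains a pool of three.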

Motivation and Context

Managing accounts through CSV files is tedious.
Account handling code was scattered across the project and required multiple queues and variables. It should now be much simpler to control accounts, allowing more complex logic.

TODOs:

  • Web endpoint to display accounts in the database and provide administration interface:
    • Insert accounts (import accounts from a CSV through web-browser)
    • Export accounts to CSV file.
    • Clear all accounts from the database.
  • (Optional) Provide a configuration parameter to control how account rotation works - currently hard-coded to maximize account re-usage.
  • (Optional) Configurable parameter to allow captcha'd and failed accounts to be immediately released from the instance - they won't remain allocated (so they won't show up in the status printer), which makes it possible to have a dedicated instance just to solve captchas.
  • (Optional) Stuff like device_info, proxy_url and perhaps asset_download_status (future work - cache downloaded assets in json) could be kept in the database to maximize the consistency of usage.

How Has This Been Tested?

Production environment with MySQL and 40 instances sharing a 10k account pool.

Screenshots (if appropriate):

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.

@sebastienvercammen sebastienvercammen added this to the 4.1.1 milestone Jan 6, 2018
Alderon86 added a commit to Alderon86/RocketMap that referenced this pull request Jan 6, 2018
Alderon86 added a commit to Alderon86/RocketMap that referenced this pull request Jan 7, 2018
Alderon86 added a commit to Alderon86/RocketMap that referenced this pull request Jan 7, 2018
Alderon86 added a commit to Alderon86/RocketMap that referenced this pull request Jan 7, 2018
@Alderon86 Alderon86 mentioned this pull request Jan 7, 2018
Comment thread pogom/models.py Outdated

# Make a GMO request before the encounter.
if using_db_account:
response = gmo(hlvl_api, hlvl_account, scan_location)
Contributor


Don't do this, at least not without a flag. Currently people are teleporting accounts all over the place and it works for the encounter; adding a GMO will make the encounter take longer and won't give any interesting info.

Contributor Author

@neskk neskk Jan 10, 2018


Last night @Alderon86 explained this to me... I thought it would be a red flag for accounts to encounter mons without even doing a GMO before. I would have thought that it was a blatant misbehavior that would be quickly picked up by the game.
I will revert this change or at least add an option to enable it by choice.

Comment thread runserver.py
search_thread.start()
else:
# Check if we are able to solve captchas.
if args.captcha_solving and not args.hash_key:
Contributor


OS can do captcha solving now?

Contributor Author

@neskk neskk Jan 10, 2018


Hmmm, this is still a WIP. I'm thinking that it should be possible to have an --only-server instance dedicated to captcha solving that would pick up captcha'd accounts from the database and just listen for tokens (manual captcha solving). Right now this check makes little sense because I disabled AccountManager when launching in --only-server.
I'm thinking of a flag that tells AccountManager to immediately release captcha'd accounts (if automatic captcha solving is not enabled) and then allows -os instances that have --captcha-solving enabled to pick these accounts up into their captcha queue. Then we could have a single instance querying the database for tokens, whose --hostname could be set as the --manual-captcha-domain. This only makes sense in multi-instance setups, but it's a bit hard to provide so many configuration/behavioral options in a single piece of software.

Comment thread pogom/accountManager.py

self.args = args
self.dbq = db_queue
self.whq = wh_queue
Contributor


It seems it is not used

Contributor Author


It is used often to push updates to the Account model and to be available to captcha solvers.
@friscoMad I've seen your PR #2422 and I was thinking that maybe AccountManager could hold the key_scheduler instead of it being defined as a global variable in apiRequests. I was a bit tired of having to pass the db/wh update queues to every function. args is also kept for convenience; there are some spots where I managed to avoid passing the variable to the function by having this local reference in the AccountManager object. This way, if you pass the account_manager around, you have everything you need. Originally I had put AccountManager in models.py and took advantage of args being declared there as a global variable. When I moved it out of models.py - because it's not really a database model; the AM isn't saved in the database - I considered whether I could avoid keeping a local reference to args, but it's tremendously useful to have, making the function calls a bit more concise and readable.

Comment thread pogom/accountManager.py
class AccountManager(object):
def __init__(self, args, db_queue, wh_queue, high_level=30):

self.args = args
Contributor


Avoid using args, expand the needed parameters

Contributor Author


As I've explained in the comment above, I don't feel it's that bad to save a reference. This probably has low impact on memory/performance and it's very useful to have it available. Too many variables are accessed from args; if, down the road, the scope of args usage inside AccountManager is reduced to half a dozen variables, I would expand them all. For the time being, and if I'm not wrong, accessing an attribute through a reference is a common code pattern and does not compromise performance.

Comment thread pogom/accountManager.py Outdated
cycle = 0
time.sleep(60)
while True:
# Run once every 15 seconds.
Contributor


This will not run every 15 secs, it will run every 11m 15 sec if the cycle thing is ok

Contributor Author

@neskk neskk Jan 10, 2018


Hum... you probably didn't read with enough attention; this is the same logic I used in #2383, and you've probably misunderstood how it works.
The sleep timer is always 15 seconds. Sometimes the loop only runs the first bit (hence "Run once every 15 secs"); every 4 cycles it calls some more functions and then sleeps 15 secs (hence "Run once every minute" - 4 x 15 secs); and, finally, every 40 cycles (40 x 15 secs = 600 secs = 10 min) it calls the account monitoring routine.
This allows a single thread to run different routines at different intervals. One of the reasons I brought the captcha code into the AccountManager was to avoid having an extra thread just to monitor the captcha queue.
The AccountManager thread runs the account recycler, the captcha overseer and the account monitor routines. Perhaps I went too far into premature optimization...

Contributor

@friscoMad friscoMad Jan 10, 2018


Yup, my bad, it works - but it is a bit convoluted. Use something like this, it is a lot clearer:

counter = 0
while True:
  counter += 1
  captcha_manager()

  if counter % 4 == 0:
    account_keeper()
    account_recycler()

  if counter % 40 == 0:
    monitor()
    counter = 0

  time.sleep(15)
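The interval logic of that loop can also be isolated into a pure function, which makes the schedule easy to unit-test. A sketch, with routine names taken from the comment above:

```python
def routines_due(counter):
    """Return which routines fire on a given 15-second tick.

    Mirrors the scheduling loop above: captcha handling every tick
    (15 s), account keeping/recycling every 4 ticks (1 min), and
    monitoring every 40 ticks (10 min).
    """
    due = ['captcha_manager']
    if counter % 4 == 0:
        due += ['account_keeper', 'account_recycler']
    if counter % 40 == 0:
        due.append('monitor')
    return due
```

The loop then reduces to iterating the counter, calling everything in `routines_due(counter)` and sleeping 15 seconds, while the modulo arithmetic stays testable on its own.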

@friscoMad
Contributor

friscoMad commented Jan 10, 2018

Some thoughts as I haven't done a proper review:

  • Workerstatus should be merged with accounts or at least deduplicate data between the 2
  • Captcha tokens should be stored in account table
  • I prefer to keep Captcha handling in a single module, make it a service as the account manager and make the manager talk with the captcha service but I will keep the account manager to just manage accounts.

@neskk
Contributor Author

neskk commented Jan 10, 2018

@friscoMad WorkerStatus and MainWorker could be reworked, yes, but then the scope of this PR would increase greatly and we would have to rework the interface. I'm not against this, I'm just saying that it would affect even more things all at once and require even more work. Since WorkerStatus has some information that duplicates Account, we could fetch that information from the Account table. But there's a tradeoff: one more "foreign key", and the query for WorkerStatus becomes more complex. WorkerStatus table records never grow too much - they're cleaned once they go 30 min without being updated - and Accounts aren't updated nearly as often as WorkerStatus records are. I decided to keep them as they are for the time being, until we really decide how to handle this.
I can see captcha handling as a separate module, but since AccountManager had all the required "things" around, I went for the easiest route. Maybe now it's easier to refactor it back into a separate module that holds a reference to AccountManager to interact with its queues.
Finally, captcha tokens being saved in the Account table makes little sense to me; it would be a VARCHAR column that would be null almost everywhere. I know that nulls don't occupy space, but I don't see a good reason to do this.
Thank you @friscoMad for taking the time to look at this PR, I always appreciate constructive criticism and feedback 👍

@FrostTheFox
Member

I know some of this has been stated over Discord, but I want to make it clear going forward.

You have already been removed from the #pr channel in Discord twice for issues with our processes, decisions and methods of collaboration. I previously told you (#1819) to remain professional on GitHub or you will be blocked from the repository. You had not said anything else unprofessional on GitHub, so I did not have any problem allowing you to open PRs and work on the GitHub repository.

Fast forward to now, the plan was for #2376 to be merged first, as it had a tested and proven database model, and one that we had been discussing inside #pr for a little while. You would rebase any work on behavior off of said PR, that way, we have the database model in place, solidified and tested, and we can build the behavior around it, and test it separately. You were told this by Seb.

Instead, what was done was that this PR was opened in direct competition with #2376, completely disregarding the original plan. This is completely against what we aim to do here at RM, not only in terms of behavior on the repository, but in our testing preferences as well. You recommended skipping #2376 because "their starting points are pretty much the same and I got account rotation to work correctly using the database. I'm not saying PR #2424 is absolutely ready and flawless but I think it's already more stable and reliable than this one, for instance". You previously mentioned that you had talked with Alderon about reworking some old code from Arengo. All this begs the question: why the hell would you decide to open a PR in competition with #2376 instead of collaborating with Alderon on it? If they really have the same starting points, it should have been no problem.

In no particular order, a few replies:

"Nice way of saying thanks to me for my work because it seems really not important to you guys."
This is the type of attitude that we don't tolerate in RM. Contributions are highly appreciated, but doing them in a way that causes drama and trouble for the rest of the team, and then complaining that we don't appreciate your work when we condemn the drama, is highly counterproductive and doesn't make us very happy. It often takes time where we could be working on actual code.

"Plus, collaboration doesn't not have it's credits properly accounted for as I told you last week when you merge spawn-spawn rework which had 90% of the code done and debugged by me."
We don't work on this project to have our names on the front of the commits page. Issues with credit for code should be taken up with GitHub; it's not our fault that the opener of the PR is the name displayed. Also, we have our team of reviewers and devs, all of whom are tasked with reviewing PRs and approving them. Their approvals are what determine the mergeability of PRs - do their names get put on every PR that they approve? Nope.

"Sometimes I wonder why I keep spending my time on this project since all you do @sebastienvercammen is hold a grudge towards me... but it's fine, I forgive you :)"
I was originally the one who, in my discussions with Seb, was partially responsible for giving you the 2nd chance in #pr; I originally talked to you about the issue in Discord and tried to level with you. I was the one who warned you to stop taunting @sebastienvercammen on GitHub. Calling out someone specific for "having a grudge" is an attempt at deflecting blame and accountability for an individual's behavior. I gladly accept criticism where it is due, I often receive it both outside and inside the context of RM, and I don't think I have ever not noted the criticism, even if I felt the person was just trying to troll me. Constructive criticism is valid regardless of the context and subject, whether it be on code or behavior. My point here is that someone telling you your attitude will not be tolerated, and you then not changing that attitude, doesn't mean they have a "grudge" against you, especially when their decisions are made in coordination with, and with the approval of, other members of the team.

You claim that you talked to Alderon for a bit, and that it "immediately became clear that it would be easier just to go to 2424 and I think Alderon realized that as well". If this were the case, why would Alderon not have come to us and expressed his desire to continue with 2424 and abandon 2376? (I'm not blaming Alderon here; I'm just saying that if this were truly the case and you were in collaboration with Alderon, then that is what should have happened.)

Finally, you seem to be confused as to why we prefer to separate things in the first place. It's a very simple guideline and one we've tried to follow for a while now. A 500-line PR that adds, say, two new features is a large blob of changes to look at and test, especially when multiple features are mixed into it. But if you split it into a couple of PRs, each for a specific feature, then each PR's code can be reviewed separately and then tested separately. It keeps the procedure smoother for our review team: we can simply hand them a list of PRs (each of which, in a perfect world, would be a single feature), and if there's an issue with feature A, they can just say "PR #x has x issue" and comment on that PR, while feature B (PR 2) might be perfectly fine and can already be merged into the main repository while A (PR 1) is being patched. Whereas one PR for both A and B means reviewers need to say "Feature A in PR #x is broken", and that delays the entire PR until the issue is fixed. This is not a unique standpoint held by RM; it's standard project management.

And finally, the driving point behind this whole wall of text is that you have received multiple 2nd chances and warnings to change your behavior. Apparently, they are not being taken as they should be. Therefore, I am here to extend to you your final warning. This warning is NOT from Seb; this warning is not from anyone else on the RM team. This warning is from myself. Stop causing drama, stop causing issues, stop exhibiting anti-collaborative behavior, or I will not hesitate to block you from collaborating on the repository. We normally allow anyone to collaborate, but continued incidents are going to result in blocks.

This is your choice. If you want to continue to help and collaborate with us, change your attitude, and work with the other members, not against them. I'm not going to allow more of the time of our team to be wasted on any issues like this, so should they arise in the future, immediate action will be taken.

@neskk
Contributor Author

neskk commented Jan 15, 2018

@FrostTheFox

  • I didn't put my PR here in competition with Accounts in Database #2376. Everyone is spinning this as me trying to overshadow that work, and it's not true.
  • I certainly didn't want any of this drama and was unaware of it, since I'm out of the #pr channel.
  • I understand that it's best to split functionality between PRs, but some things make sense together.
  • I'm not doing this for bragging rights.
  • I also don't want to waste more time on this discussion. I didn't start it, I have explained my reasons for posting a new PR, and if I went against any rule I would accept any decision you guys made and move forward immediately.
  • Everyone jumps the gun to attack me, sometimes without reason, but OK... I'll get over it.

@FrostTheFox
Member

FrostTheFox commented Jan 15, 2018

  • I use the word competition because the PRs overlap in scope.
  • Which is great, but it doesn't discount that it happened.
  • Accounts in the database and an account manager are two different things which should be separate, imo. If you don't feel that way, well, that's just how you feel, I guess? I'll admit I haven't looked at the code intricately, but I'm sure that accounts in the database as basic functionality is certainly within reach without a lot of account manager work. Anyway, that's not the real point; the point isn't necessarily about the splitting, it's about us making our plan known, having decided that's what we want to do, and it then being ignored.
  • So why bring it up in the first place?
  • I don't either, so this is going to be my last reply.

My reply was to explain why, as of now, I feel it is necessary to give you a final warning. I could have just left you the last sentence, but I prefer to give context. Therefore, I am done here and any further comments will be related to code.

@neskk
Contributor Author

neskk commented Jan 15, 2018

Perhaps if everyone looked more at the code, this whole discussion would have been avoided.
#2376 was far from "well tested" and fully functional; those claims are untrue. People have the right to disagree with me. I said that it made much more sense to put in the account manager instead of some hacked functions glued together just to preserve old structures.
I did bring up the PR credits because, this time (unlike spawn_scan), I would like my efforts to be properly accounted for. I know it's an issue with GitHub; I would never have brought it up in public. I was talking privately with Seb and he published those messages out of context: last week I told him that the Contributors tab didn't track the work I did in spawn_scan. Still... I would never refuse to collaborate because of this and, at this point, I really don't care whether you believe me or not.
My branch quickly moved past the point where #2376 was, and as far as I know it's not a crime to not want to fix somebody else's PR after I had already done that work. I haven't got much time now, and I'm sick and tired of this subject. I didn't ask for this drama; I simply submitted a PR and explained the differences, similarities and reasons for doing so. If it was so wrong, you guys had the power to close it, but you can't force me to do something and then tell everyone that I'm not being "collaborative", because it's easy to say that while in your shoes.

Comment thread pogom/utils.py
return device_info


def generate_instance_id(args):
Contributor


I believe using the process ID breaks setups running under systemd.
Restarting rocketmap.service doesn't change it, and I got stuck on "duplicate instance detected" until I manually cleared it.

Contributor Author


That message tells you that an instance with that same instance_id was running 30 seconds ago.

  1. If you're restarting the instance, allow 30 seconds of cool-off between shutdown and startup and you won't get that message.
  2. Maybe there are indeed conflicting instances, in which case you shouldn't be manually clearing them up. generate_instance_id should return a different instance_id for each of your running instances. It hashes location, worker count, status name and several other configuration parameters to make sure we get a unique value.

Either you didn't wait the 30 secs between shutdown and launch, or you're trying to launch an instance in the same location with the same step size, worker count and scanning parameters.
The status name is ignored in generate_instance_id if it's not set; only after generating the instance_id is it updated to match the process ID. This is done so that you don't get a different instance_id each time you launch an instance without a defined --status-name.
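The idea of deriving the instance ID from the configuration can be sketched as follows. This is a hypothetical illustration of the approach, not the PR's actual code, which hashes several more flags:

```python
import hashlib

def generate_instance_id(location, step_limit, workers, status_name=''):
    """Derive a deterministic instance ID from configuration parameters.

    Two instances launched with the same configuration deliberately
    collide, which is what makes duplicate-instance detection possible.
    """
    key = '|'.join(str(p) for p in (location, step_limit, workers, status_name))
    return hashlib.sha256(key.encode('utf-8')).hexdigest()[:16]
```

Because the ID is a pure function of the configuration, changing any hashed parameter (e.g. the worker count) yields a different ID, while a restart with identical settings reproduces the old one.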

Contributor


tyvm

To anyone running this on systemd with Restart=always: you should now add
RestartSec=30

Contributor Author


Adjusted this cool-down timer to 10 secs.

@neskk
Contributor Author

neskk commented Jan 18, 2018

I'm rewriting account allocation; the current method is imperfect and I think this will solve some weird issues.
It should be pushed in a couple of hours.

Comment thread pogom/account.py


# Keep track of different types of bans using a single field.
class AccountBanned:
Contributor


Use Enum for this

Contributor Author


I feel like Enum is a bit of overkill for this. I've thought about it and feel it's unnecessary for the time being; maybe later, if we have a more generic way of managing bans, we can change this.

Contributor Author


Enum doesn't "let" me do things like if account['banned'] <= AccountBanned.Shadowban, which are useful. I tried to change the code, and all the database queries would have to use AccountBanned.Clear.value, which is also annoying. For the time being and, as I discussed with Seb some time ago, I think I will let this stay a simple class, which is more flexible than Enum.
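The plain-class approach being discussed can be sketched like this, with the values taken from the commit notes below (0 Clear, 1 Shadowban, 2 Tempban, 3 Permban):

```python
class AccountBanned:
    # Plain integer constants: ordering comparisons and raw DB values
    # work directly, without any Enum .value boilerplate.
    Clear = 0
    Shadowban = 1
    Tempban = 2
    Permban = 3

def usable_for_scanning(account):
    # Shadowbanned accounts can still scan (see --shadow-ban-scan);
    # temp/perm bans cannot.
    return account['banned'] <= AccountBanned.Shadowban
```

For what it's worth, Python's `enum.IntEnum` (unlike plain `Enum`) also supports ordered comparisons and compares equal to plain ints, so friscoMad's suggestion could be adapted without the `.value` annoyance; whether that's worth it here is a judgment call.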

Comment thread pogom/account.py Outdated

def spin_pokestop(api, account, args, fort, step_location):
def spin_pokestop(args, account_manager, status, api, account, fort):
if not can_spin(account, args.account_max_spins):
Contributor


Args is not used

Contributor Author


Fixed ;)

@friscoMad
Contributor

I still haven't done a full review, but I have some ideas:
For simplicity I would try to aggregate everything in the account instance (status, parent_manager, api, proxy...). We are just passing everything around together, so it makes sense to keep it together - it's all account information.
AccountManager could be just a module or a singleton that you could import and use anywhere, but I'm not sure of the most pythonic way to do that.

I see you did a lot of work on captcha handling. Still, captchas should be handled in apiRequests.py, as they can appear at any time (captcha'd accounts get the message during the login process, for example), and after a captcha all requests should be retried (if the captcha was solved). That may need more changes on top of yours - I'm not saying you should add that right now in this PR, but take it into account if possible, to make the evolution as easy as possible.

@neskk
Contributor Author

neskk commented Jan 20, 2018

@friscoMad I totally agree with what you said. I tried to leave room for future changes and I think the account manager can accommodate what you're thinking.

  1. AccountManager - a singleton that handles all account data. It can be done; I avoided putting more things inside for now so that this PR doesn't grow even more. I like this idea and may execute it in the future.
  2. Captcha rework - I tried to create a generic method to handle captchas. I agree with you: captcha detection should go inside apiRequests, but then apiRequests would have to "know" about AccountManager. This way I keep changes to apiRequests to a minimum. I'm happy with the current solution; it works fine, but it can still be improved.

@billyjbryant
Contributor

This is going to need to be rebased soon, as #2533, #2528 and #2537 will all change the db_schema_version.

neskk added 29 commits March 17, 2018 05:45
Starting point, removed shadow-banned related things and minimized deviations from develop branch.
Need to create an account manager - using AccountSet as inspiration.
Review how accounts are loaded into the database - query all, filter input files (if any) and only insert unique - set username as unique in Account model?
Many annotations, we need to address a lot of stuff to make this pretty.

Would be cool if some one started working on front-end page.
Merge branch 'F-HealthyAccountDB' of https://github.com/Arengo/RocketMap into pr-accounts-db

Rewrote accounts-csv parser.

Adjusted Account database model to make things more streamlined.
Accounts-csv now supports a syntax that has an extra column with account level - removed hlvl-accounts.
Instead of two separate functions to check new accounts and insert them, reworked the code to do both operations in one function (Account.insert_new).
Worker count is now required since this defines how many accounts we'll be using.
encounter_pokemon accounts can now solve captcha on the spot.
spin_pokestop uses location from account, no need to pass one more argument.
Removed 'shadowban' field in favor of having a 'banned' field with multiple status.
AccountBanned: 0 (Clear), 1 (Shadowban), 2 (Tempban) and 3 (Permban)
Improved account monitoring/cleanup routine.

Fixed high-level encounter logic, all situations are now correctly handled by the account manager, in particular:
 - When accounts are wrongly inserted as high lvl, they now fail and after resting they are returned to the appropriate queue.
 - Account information wasn't being updated upon account releasing.

Added altitude in some places we were missing it. I don't think it's worth keeping this field in Account database model.
Reverted some odd changes in schedulers.py
parse_accounts_csv now filters duplicate accounts in files and only declares the default account once.
If args.hlvl_workers is set to "0" but encounters are enabled, the Account Manager won't keep the accounts allocated.
This means that the queue of high-level accounts won't hold the account after it is used.
If args.hlvl_kph is set to "0", the speed limit won't be checked and we can maximize the number of encounters an account can do.
Added parameter "--encounter-retries" that controls the maximum number of attempts that an account will retry an encounter.
Some methods inside Account Manager should never be used from "outside" the class - making them "private".
Bug fix on AccountManager::run_manager()
Logging in account manager fixed, no more "duplicate" messages - old debug code.
Account manager thread startup moved outside of the class init function. This fixes an issue when starting the scanner without any accounts already in the database: it would only fetch the inserted (new) accounts after a minute (the first loop of the account manager).
Refactored a bit of code in _allocate_accounts to avoid duplication.
Fixed indentation of all actions that require an account manager - runserver.py
Small bugfixes.
Changed type of Account fields from 'CharField' to 'Utf8mb4CharField'.
Added 'instance_id' to MainWorker and WorkerStatus
A bit of clean up on worker status updates.
Added a check at startup to avoid running duplicate instances.
Fixed some bugs in account.py - reset_account()
When --hlvl-workers is set to "0" it will try to allocate accounts on-the-fly. There was a bug that caused the instance to allocate 5 accounts because it wouldn't break on success.
Overseer stats message now includes more information about the account pool.
Status page "fixed": uses instance_id instead of worker_name to compute MainWorker hash.
WorkerStatus model updated to include "norares" count.
Status page and status printer thread displays the number of scans without rare pokemon (no-rares).
Instance ID generation now takes into account "no-gyms", "no-pokemon", "no-pokestops" and "no-raids" flags - This will help avoid conflicting instance IDs.
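One way to derive an instance ID from those flags is to hash them together with the instance's status name. This is a minimal sketch of the idea, not the actual generation code; the function name and the use of MD5 are assumptions.

```python
import hashlib


def generate_instance_id(status_name, no_gyms, no_pokemon,
                         no_pokestops, no_raids):
    # Combine the status name with the content-selection flags so two
    # instances scanning different targets never share an instance ID.
    key = '{}:{}{}{}{}'.format(status_name,
                               int(no_gyms), int(no_pokemon),
                               int(no_pokestops), int(no_raids))
    return hashlib.md5(key.encode('utf-8')).hexdigest()
```

Any change to one of the flags produces a different ID, which is what avoids the conflicting-instance check tripping on legitimately different configurations.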
Old captcha logic was a bit fuzzy and there was a lot of duplicate code. I think I'm happy with this solution, still needs proper testing.
Reworked logic to unallocate high-level accounts after usage.
"--hlvl-workers-holding-time" is used when "--hlvl-workers" is set to 0 (on-the-fly allocation).
Encounter retry cycles fixed: we don't want to wait for nothing.
Database schema version updated and added schema "migration".
Re-organization of how account manager keeps track of allocated, active and spare accounts.
AccountRecycler: after failing and waiting its holding time in failed queue, banned accounts are released and deallocated from the instance.
AccountKeeper: release excess accounts that aren't being used (-asi << -ari).
Cleaned up handle_captcha() code to be more understandable.
Added separate locks for each account pool and rewrote functions to move accounts between pools - performance improvement.
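The per-pool locking described above could look roughly like the sketch below (class and method names are hypothetical, not the AccountManager internals). The key points are one lock per pool and a fixed acquisition order, so a move only blocks the two pools involved and two opposite-direction moves can't deadlock.

```python
import threading
from collections import OrderedDict


class AccountPools(object):
    """Sketch of per-pool locking for moving accounts between pools."""

    def __init__(self):
        names = ('active', 'spare', 'failed')
        self.pools = {name: OrderedDict() for name in names}
        self.locks = {name: threading.Lock() for name in names}

    def move(self, username, src, dst):
        if src == dst:
            return False
        # Acquire locks in sorted name order so two threads moving
        # accounts in opposite directions can never deadlock.
        first, second = sorted((src, dst))
        with self.locks[first], self.locks[second]:
            account = self.pools[src].pop(username, None)
            if account is None:
                return False
            self.pools[dst][username] = account
            return True
```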
Moved captcha code block to the end of the file - keeping accountManager.py organized.
Reduced timeout to 10 secs on startup duplicate instance launch check. Log message updated to avoid doubts. MainWorker should be updated every 3 secs, so I lowered the cooldown period from 30 secs to 10 secs, which should be enough.
AccountManager internal methods now all start with '_', e.g. _account_keeper(). Methods that are supposed to be called from outside don't have this '_' prefix.
Tested encounters, with hlvl_kph set to 0, one account can perform several encounters in seconds.
Captcha solver now uses account.py setup_api() - tested.
Many, many, too many changes...
Rebased.
handle_captcha now handles "False" responses from failed requests.
Still have some issues with account allocation in a setup with multiple concurrent instances.
Added a double check that validates which accounts were successfully allocated in case something went wrong with update query.
Removed "replenish_count" counters; I used them previously to replace the active pool, but in the end I figured it's best to keep an OrderedDict with the active account pool.
Refactored and moved all account allocation code to _allocate_accounts and _fetch_accounts.
Reduced the time an account has to wait before retrying to get a spare account. Maybe this can even be lowered further to allow lower excess account holding times.
Turns out account allocation was fine and a bug in WorkerStatus was giving inconsistent results on status page.
WorkerStatus acts as a temporary record keeper for account stats (no-rares and captcha count). Other than that there's no more useful information to fetch from the previous status, since the remaining information (latitude, longitude, last_scan, last_modified) now comes from Account data.
Renamed WorkerStatus field 'last_scan_date' to 'last_scan' to match Account model. Updated and made adjustments where 'last_scan_date' was being used (schedulers.py).
WorkerStatus a.k.a. 'status' field 'last_scan' can now be set to 'None'. This change will make it easier in the future to merge/remove duplicate information from WorkerStatus and use latitude, longitude, last_scan, directly from Account model.
Small adjustments in Speed-Scan and Spawnpoint-Scan to accommodate 'last_scan' changes - I think the code is more readable now and everything works the same way.
Renamed WorkerStatus field 'norares' to 'no_rares' to match with 'no_items'. Updated all references of this field.
Created a method that returns the default WorkerStatus dictionary with its stats reset - reduced duplicate code and separates local/runtime data from persistent/database data.
Reordered WorkerStatus fields: Skips > No Items > No Rares > Captchas
Status page will now only display WorkerStatus/MainStatus that were active in the last 30 minutes. Improves consistency of status page for those who don't enable database cleanup.
Account stats information displayed in manual captcha solving page will also only include stats from MainWorker that have been recently active.
Small improvements in some logging messages and formatting in search.py.
Since accounts are shared between instances we needed a way to keep track of the number of failed login attempts.
The solution we had was local/runtime only and wouldn't work for "on-the-fly" account allocation.
Renamed Account model field 'fail' to 'failed' and changed its type to SmallInt in order to have a persistent counter of login failures. The 'fail' field was only used to signal that an account was in the failed account pool and had almost no purpose; now it's an important field.
Added '--account-max-failures' to control how many consecutive failed login sessions it takes to flag an account as perm banned. Default is 5, meaning the account will have to fail N login-retries on 5 different occasions to be considered banned. After the first failure the account is put to sleep for a small period of time and will attempt to login again later.
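The bookkeeping behind '--account-max-failures' can be sketched as follows. The function name is hypothetical and the account is shown as a plain dict rather than the peewee model; the point is that a success resets the persistent 'failed' counter, so only consecutive failed login sessions count toward the ban threshold.

```python
def register_login_result(account, success, max_failures=5):
    """Update the persistent 'failed' counter on an account record.

    Returns True when the account should now be flagged as perm
    banned, i.e. it has reached max_failures consecutive failed
    login sessions.
    """
    if success:
        # A successful login wipes the streak of failures.
        account['failed'] = 0
        return False
    account['failed'] = account.get('failed', 0) + 1
    return account['failed'] >= max_failures
```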
Reverted some changes in '--hlvl-workers-max', it is only used when '--hlvl-workers' is 0 (on-the-fly allocation).
Improved overall robustness of critical code blocks and database operations.
Added back some safety checks to avoid having the same account being used by multiple search workers.
Added database().execution_context() to _allocate_accounts() to make sure account allocation is made within a transaction.
Improved thread safety locks, seemed like Account Manager was losing track of some accounts.
Added a try..except to _account_keeper() to pick up critical errors in account management.
No longer uses a cycle counter, instead using timers to allow more flexibility and to clarify code's logic.
Changed check_login() to output more info and handle exceptions. This function now returns a boolean according to login success.
Removed exceptions no longer used (TooManyLoginAttempts and LoginSequence). All information is still kept for debug.
Login failures now expect a timeout afterwards; this prevents prematurely marking all accounts as perma-banned just because something was wrong for some hours (e.g. PoGo login servers offline).
Fixed duplicate 'only-server' test in runserver.py.
Account model now holds the methods that insert and clear accounts in database.
Small changes/additions to code comments.
Save remote configuration asset and template times in the database (Account model).
Only perform assets requests if database times don't match current remote configuration.
This change should save a lot of RPM and make login times shorter.

Update database schema:
```
ALTER TABLE `<database>`.`account`
ADD COLUMN `time_assets` INT(4) UNSIGNED NOT NULL DEFAULT 0 AFTER `level`,
ADD COLUMN `time_templates` INT(4) UNSIGNED NOT NULL DEFAULT 0 AFTER `time_assets`;
```
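The check that saves those requests could look like the sketch below. The column names come from the migration above; the function name and the dict-style account record are assumptions for illustration.

```python
def needs_remote_download(account, remote_time_assets,
                          remote_time_templates):
    """Compare the asset/template timestamps cached on the account
    row with the current remote configuration; only request the
    downloads when they differ, saving RPM and shortening logins."""
    return (account.get('time_assets', 0) != remote_time_assets or
            account.get('time_templates', 0) != remote_time_templates)
```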
This fixes the situation where newly inserted accounts would be picked up instead of older ones due to having a more recent last_modified value.
