Skip to content

[WIP] Refactor upsert, repsert, upsertBy and create repsertBy#785

Closed
pythonissam wants to merge 11 commits intoyesodweb:masterfrom
pythonissam:upsert-repsert-and-friends
Closed

[WIP] Refactor upsert, repsert, upsertBy and create repsertBy#785
pythonissam wants to merge 11 commits intoyesodweb:masterfrom
pythonissam:upsert-repsert-and-friends

Conversation

@pythonissam
Copy link

@pythonissam pythonissam commented Feb 19, 2018

My TODO:

  • Fix haddock
  • Add examples to haddock
  • Implement repsertUnique
  • Implement repsertUniqueBy
  • Implement repsertUniqueMany
  • Implement repsertUniqueManyBy

Before submitting your PR, check that you've:

After submitting your PR:

  • Update the Changelog.md file with a link to your PR
  • Check that CI passes (or if it fails, for reasons unrelated to your change, like CI timeouts)

@pythonissam
Copy link
Author

issue ref: #782

@pythonissam pythonissam changed the title [WIP] Refactor upsert, repsert, upsertBy and repsertBy [WIP] Refactor upsert, repsert, upsertBy and create repsertBy Feb 26, 2018
@chreekat
Copy link
Contributor

I would suggest that the amount of description you have added here is better suited to a blog post, rather than to reference material like Haddocks. In particular, duplicating the boilerplate Person definitions for every method is distracting. (It's also unworkable: mkPersist will create a Person data type.)

@chreekat
Copy link
Contributor

I think your understanding of upsert is not yet complete. :)

The upsert methods are only usable on entities that have specified uniqueness constraints. upsert works on entities with a single constraint. upsertBy allows you to pick which constraint to use, for entities with more than one constraint.

Entities with uniqueness constraints might cause insert to fail if the constraint(s) would be violated. The upsert functions let you explain how you want those conflicts resolved. They say, "Update the existing record instead, using these update statements". If there is no conflict, the new record just gets inserted directly.

Repsert, on the other hand....

Looking at the documentation, repsert is not what I thought it was. It looks like it must have been based on Mongo's idea of repsert. I had assumed it was a member of PersistUniqueWrite, just like upsert. In my universe, repsert has type

-- THIS DEFINITION DOES NOT EXIST IN PERSISTENT
repsert
    :: (MonadIO m, PersistRecordBackend record backend)
    => record
    -> ReaderT backend m (Entity record)

And it replaces on conflict, rather that updates on conflict. This is what MySQL's replace command does.

I guess I should add a method repsertUnique if I really need it :)

Hope that helps.

@MaxGabriel
Copy link
Member

I'm definitely in favor of keeping the example documentation. However, it would be better to move the duplicate parts to the top of Database.Persist.Class.hs, as an example dataset that's meant to be referenced by other functions. This would match how the Persistent combinators (e.g. +=.) are documented.

There are also some slight modifications we could make to make the example documentation a little shorter

@pythonissam
Copy link
Author

@chreekat @MaxGabriel
I agree with Max. Related functions are little bit confusing and I immediately forget which is which. I think examples like this would help newbies to understand their usages.

However, As both of you said, added examples seem to have room for improvement...

@pythonissam pythonissam force-pushed the upsert-repsert-and-friends branch from 8a5a4fb to ff0db5c Compare March 5, 2018 10:12
@pythonissam
Copy link
Author

@chreekat @MaxGabriel I've updated and fixed the examples of upsert-related functions. Would you take a look at them?

pythonissam@d37b134

@chreekat Also, I've implemented repsertUniqueBy for now. What do you think about this?

pythonissam@ff0db5c

@chreekat
Copy link
Contributor

chreekat commented Mar 5, 2018

Also, I've implemented repsertUniqueBy for now. What do you think about this?

I'll leave it to your judgment.

-- We try to explain 'upsertBy' using the schema 2.
--
-- @
-- upsertBy (UniqueUserName \"SPJ\") (Person \"X\" 999) [PersonAge +=. 15]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you'd like, you can use the "bird track" haddock code syntax, so you don't have to escape things:

> upsertBy (UniqueUserName "SPJ") (Person "X" 999) [PersonAge +=. 15]

The main advantage to the @ syntax is that you can create hyperlinks to identifiers using single quotes, but if you aren't doing that then it's noisier to use @

@MaxGabriel MaxGabriel mentioned this pull request Mar 8, 2018
@MaxGabriel
Copy link
Member

repsertUniqueBy looks good and I think the docs are much better!

@pythonissam
Copy link
Author

@MaxGabriel Thank you!! and I'm gonna fix haddock later as you suggested:)

@pythonissam
Copy link
Author

pythonissam commented Mar 11, 2018

@naushadh I've implemented repsertUniqueMany and repsertUniqueManyBy. However, I got confused after reading the document of repsertMany. It reads:

Differs from insertEntityMany by gracefully skipping pre-existing records matching key(s). @SInCE 2.8.1

Is this alright? Considering its name, I think it should map repsert over the records and actually, its default definition is as follows:

repsertMany
        :: (MonadIO m, PersistRecordBackend record backend)
        => [(Key record, record)] -> ReaderT backend m ()
    repsertMany = mapM_ (uncurry repsert)

but in the SqlBackend instance declaration for PersistentStoreWrite, it reads:

    repsertMany krs = do
        let es = (uncurry Entity) `fmap` krs
        let ks = entityKey `fmap` es
        let mEs = Map.fromList $ zip ks es
        mRsExisting <- getMany ks
        let mEsNew = Map.difference mEs mRsExisting
        let esNew = snd `fmap` Map.toList mEsNew
        insertEntityMany esNew

ref: https://github.com/naushadh/persistent/blob/40acff65772a01916b08a5f56f60f9e540792f93/persistent/Database/Persist/Sql/Orphan/PersistStore.hs#L275-L282

@pythonissam
Copy link
Author

Another question. upsert returns the record in the database after the operation, but repsert doesn't:

upsert
        :: (MonadIO m, PersistRecordBackend record backend)
        => record          -- ^ new record to insert
        -> [Update record]  -- ^ updates to perform if the record already exists (leaving
                            -- this empty is the equivalent of performing a 'repsert' on a
                            -- unique key)
        -> ReaderT backend m (Entity record) -- ^ the record in the database after the operation

repsert :: (MonadIO m, PersistRecordBackend record backend)
        => Key record -> record -> ReaderT backend m ()

Is there any reason for leaving it like this? I think it'll be more convenient if repsert also returns the modified record.

@pythonissam
Copy link
Author

@MaxGabriel Yet the implementation isn't finished, I guess my work is mostly done. For now, would you review my commits and correct my English accordingly?

-- | Replace based on a uniqueness constraint or insert:
--
-- * insert the new record if it does not exist;
-- * If the record exists (matched via it's uniqueness constraint), then replace the existing record with the given record.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's -> its

-- Batch version of 'repsertUnique' for SQL backends.
--
-- Useful when migrating data from one entity to another
-- and want to preserve ids.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Useful when migrating data from one entity to another and want to preserve ids.

This should have a subject for the verb want, something like:

Useful when you're migrating from one entity to another and want to preserve IDs.

@MaxGabriel
Copy link
Member

@pythonissam Ok what do you think about this: we don't add repsertUnique or repsertUniqueMany, and we deprecate upsert (see #796). It seems like it's just setting people up for trouble that they can add a new unique constraint to their Persistent type, and suddenly get runtime exceptions in their code.

@naushadh
Copy link
Contributor

#785 (comment)

@pythonissam: repsertMany is implemented that way in SqlBackend for performance reasons. See pull#770 for notes/benchmark.

@pythonissam
Copy link
Author

@MaxGabriel No problem with that, except repsert. Do we leave it as it is? I feel unconfortable the presence of it even though upsert is deprecated.

@pythonissam
Copy link
Author

@naushadh Hmm... I understand, but it isn't the batch version of repsert, is it? I suggest that we change the name of it to clarify what it actually does.

@MaxGabriel
Copy link
Member

@pythonissam What's wrong with repsert that it should be changed?

@pythonissam
Copy link
Author

@MaxGabriel It's a matter of correspondence:

upsertrepsert
upsertByrepsertUniqueBy

It's like "Why is there repsert even though there's no upsert?"
However, it may just be my personal preference. If it's ok with you, then I'm ok too:)

@pythonissam
Copy link
Author

@MaxGabriel So we won't add any new functions for now, right?

@pythonissam
Copy link
Author

ping @MaxGabriel @naushadh

1 similar comment
@pythonissam
Copy link
Author

ping @MaxGabriel @naushadh

@ncaq
Copy link
Contributor

ncaq commented Apr 10, 2018

I was surprised to see this pull request after making a pull request #804 .
Our application already has repsertBy as a function of the Util module and it is running.
I made a pull request trying to reduce it to the upstream.
I did not notice the existence of this pull request during work.

I have no malice.

However, apparently, in this pull request, it seems that repsertBy has not been created yet.
So, please do not make repsertBy in this pull request, how is it to divide labor?

@pythonissam
Copy link
Author

@ncaq Never mind. However, actually, I did it. Please look at this commit.

Let me explain my thoughts.

Firstly,

-- https://github.com/yesodweb/persistent/blob/master/persistent/Database/Persist/Class/PersistUnique.hs#L99-L112
upsertBy :: Unique record -> record -> [Update record] -> ReaderT backend m (Entity record)

gets Unique record and this is what By means. So precisely, your repsertBy isn't repsertBy.

Secondly,

-- https://github.com/yesodweb/persistent/blob/master/persistent/Database/Persist/Class/PersistUnique.hs#L79-L94
upsert :: record -> [Update record] -> ReaderT backend m (Entity record)
-- https://github.com/yesodweb/persistent/blob/master/persistent/Database/Persist/Class/PersistStore.hs#L204-L208
repsert :: Key record -> record -> ReaderT backend m ()

for some reason, repsert takes Key record as its first argument! So, I decided to separate repsertUnique:

-- https://github.com/yesodweb/persistent/pull/785/commits/67534d0d4f93792bb3209ac28c8ac4729661c927
repsertUnique :: record -> ReaderT backend m ()

and implemented its By version:

repsertUniqueBy :: Unique record -> record -> ReaderT backend m ()

@pythonissam
Copy link
Author

pythonissam commented Apr 14, 2018

@MaxGabriel @snoyberg @ncaq

To sum up the discussion:

If we re-implement repsert* functions

  • The difference between upsert and repsert should only be update or replace records. However, upsert takes record while repsert takes Key record
  • upsert returns modified records, but repsert doesn't
  • Add repsertUnique and repsertUniqueBy (upsert obtains uniqueKey from record using onlyUnique. These names reflect that. They may be too long, though)
  • The implementation of repsertUniqueMany and repsertUniqueManyBy

If we don't

  • @MaxGabriel thinks deprecating upsert is enough. I think this also works.

What should I do to merge this PR? It would be helpful if you show me a rough guideline.

@ncaq
Copy link
Contributor

ncaq commented Apr 16, 2018

@pythonissam
Do you think about that type of insertBy?

insertBy :: (MonadIO m, PersistUniqueWrite backend, PersistRecordBackend record backend) => record -> ReaderT backend m (Either (Entity record) (Key record)) 

Database.Persist.Class

I do not think that by is unique record argument.
I think that by is to use unique type.

@pythonissam
Copy link
Author

pythonissam commented Apr 22, 2018

@ncaq
Sorry. I don't understand... What does "to use unique type" exactly mean?

Furthermore, I guess insertBy also has a bug... what if there're more than two uniqueness constraints? It must throw an exception or something like that.

@ncaq
Copy link
Contributor

ncaq commented Apr 24, 2018

@pythonissam

Sorry. I don't understand... What does "to use unique type" exactly mean?

unique type mean is that type of arguments have PersistUniqueWrite.

Furthermore, I guess insertBy also has a bug... what if there're more than two uniqueness constraints? It must throw an exception or something like that.

I forgot two uniqueness pattern.
However, The current impl getByValue process all unique pattern recursive.
If match some uniqueness constaints, then insertBy return Left instead exception, else insert.
So I think no problem insertBy.
Left returning value is bit ambiguity, however, impontant is Left, value not matter.

case repsertBy, I assure you.
When two uniqueness exist and repsertBy one argument, prefer first uniqueness implicit.
Deprecate upsert, So I must escape implicit select.
So, I closed #804 .
Thank @pythonissam .

@pythonissam
Copy link
Author

@MaxGabriel @snoyberg

Since There's been no progress recently, I'm thinking of closing this PR... How do you think?

@snoyberg
Copy link
Member

I haven't worked with these functions myself, so I'm not qualified to give a useful opinion, sorry.

@bitemyapp
Copy link
Contributor

bitemyapp commented Jun 17, 2018

@pythonissam I had to root around a bit to answer the question for myself, but I don't think repsert/upsert (the non-By variants) are a good idea.

Then addressing repsert specifically, it was not obvious from the name that repsert was an upsert that doesn't return the record. The idiom for this in Haskell is to make actions that return () to be suffixed with _. An example of this is traverse and traverse_. Given that, if I had my druthers the only remaining functions would be upsertBy and upsertBy_.

I apologize this wasn't brought to some sort of conclusion sooner and I hope you'll continue kicking Persistent around.

I'll close this PR as not accepted, if anyone feels strongly about this issue/feature please speak up.

@bitemyapp bitemyapp closed this Jun 17, 2018
@pythonissam
Copy link
Author

pythonissam commented Jun 23, 2018

@bitemyapp
Okay, though I'm not fully convinced as I stated in #796 . Anyway, thank you for the response:)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants