Mutable constructor fields #8

simonmar · 2016-09-20T02:43:38Z

This is a proposal to add mutable constructor fields to GHC.

simonmar · 2016-09-20T02:47:28Z

CC @simonpj @rrnewton @fryguybob @niteria

Ericson2314 · 2016-09-20T04:08:51Z

Do the constructors need to be magic like that? I rather have a magic way to construct a Ref# so all exposed variants are non-magical.

Ericson2314 · 2016-09-20T04:16:10Z

Perhaps we constrain the RuntimeReps in -> so whatever magic constructs Ref# is always inlinable / not exists on its own post compilation?

ramonkeypr · 2016-09-20T06:14:56Z

proposals/0000-mutable-fields.rst

+
+The GC write barrier for a mutable constructor may be a little less
+efficient than the write barrier for a ``MutVar#``, but this is more
+than compensated for by losing a layer of indirection.


Has the effect of "losing a layer of indirection" been measured and benchmarked? Instinctively, this seems correct to me, that removing a layer of indirection is more beneficial than the performance hit of the GC write barrier for a mutable constructor.

Other people have experimented with hacks to eliminate this layer of indirection (e.g. @ekmett's Struct and @fryguybob's TStruct) and found it worthwhile. Remember that you only pay the performance hit of the write barrier if you ask for it, and it's only likely to be a tiny bit more expensive than the MutVar# write barrier (which you would be paying anyway), if at all.

The performance benefits from my TStruct can be seen on slide 26 here:

http://www.cs.rochester.edu/u/ryates/files/ryates-IFL2016-slides.pdf

@simonmar can this bit be extended to say why the GC barrier would be less efficient, since it's not obvious to me (so probably not obvious to lots of people).

@dcoutts I think it is the granularity of the dirty header that is different. A mutable pair made of three heap objects will mark exactly which pointer is dirty while a single heap object mutable pair traverses both pointers even if only one changed.

Two reasons: one is the lack of accuracy that @fryguybob mentioned, and the other is more mundane: we know statically the name of stg_MUT_VAR_CLEAN_info and its DIRTY equivalent, but for an arbitrary mutable constructor we will have to fetch these from the info table.

@simonmar: do you think a layout where we store the number of mutable pointers in the payload might work? Then we could have just two new closure types: MUT_CTR_CLEAN and MUT_CTR_DIRTY, and the layout of the payload would be: [ number of mutable pointers | mutable pointers | non-mutable pointers | non-pointers ]. If we allowed ourselves an extra word in the payload, we may even have some bits for a mask to specify exactly which fields are actually dirty?

Based on my limited understanding, would it be possible to lay out DIRTY and CLEAN info tables in a way that the offset between them is statically known? e.g. MutPair_CLEAN_info+0x420 == MutPair_DIRTY_info.

I successfully implemented that by having a field in each info table that pointed to the other info table. They do, as far as I can tell, end up statically offset from each other but it was simple enough to add a field.

That's very interesting! Ideally, I'd hope we can get rid of that additional field, by allocating CLEAN and DIRTY tables next to each other, offset by what I hope is a (info table agnostic) constant. Quite similar to TABLES_NEXT_TO_CODE.

cartazio · 2016-09-20T12:22:36Z

Edward K has a package , Structs, that effectively allows unboxed
references in a potentially recursive setting. It's worth looking at as a
point of comparison perhaps

On Tuesday, September 20, 2016, ramonkeypr notifications@github.com wrote:

@ramonkeypr commented on this pull request.

In proposals/0000-mutable-fields.rst
#8 (review)
:

+generate code for other constructors, taking care to add the Void
+argument for the State#, and generating an info table with the
+correct information about the mutable fields.
+
+Unpacking constructors with mutable fields
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+UNPACK must be a no-op on constructors with mutable fields. There's
+no sensible way to make UNPACK work with mutable fields.
+
+Drawbacks
+---------
+
+The GC write barrier for a mutable constructor may be a little less
+efficient than the write barrier for a MutVar#, but this is more
+than compensated for by losing a layer of indirection.

Has the effect of "losing a layer of indirection" been measured and
benchmarked? Instinctively, this seems correct to me, that removing a layer
of indirection is more beneficial than the performance hit of the GC write
barrier for a mutable constructor.

—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#8 (review),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAAQwoIf5HDBgszUTJGoFyziBLBKqnq_ks5qr3nggaJpZM4KBKtX
.

cartazio · 2016-09-20T12:27:51Z

Cc @ekmett , this seems very related to your structs package / codes.

formatting tweak

goldfirere · 2016-09-20T14:35:12Z

From a surface syntax standpoint, I don't like the magical use of IOField. What if we introduce a new keyword mutable and require the GADT syntax at the bottom of the proposal?

data MutPair :: Type -> Type -> Type where
  MutPair :: mutable a -> mutable b -> IO (MutPair a b)

Differentiating between an IOField and a STField could be done by looking at the monad mentioned in the return type.

adamgundry · 2016-09-20T14:55:15Z

Can I declare a mutable field with record syntax? If so, what are the typing rules for record selection and update?

rwbarton · 2016-09-20T18:19:16Z

Would it be feasible to support mutable unboxed fields? It would be nice to have an "IORef Int#". Presumably Ref would need to gain an argument to describe the representation of the referent.

dcoutts · 2016-09-20T19:04:54Z

I very much like the fact that this involves the compiler properly knowing about the mutable fields and generating info tables, since this means profiling should work, whereas with @ekmett's cunning hack all your mutable types are the same as far as the profiler can tell.

Will there also be a Ref type, as well as Ref#? It'd be nice if error messages can mention Ref rather than scary # types :-)

remiturk · 2016-09-20T20:58:58Z

Something I've been working on in my own (unpublished, broken, incomplete & in eternal development hell) unboxed-records-with-first-class-labels-library is the freezing & thawing of records, analogous to the way it works for e.g. (im)mutable vectors. That way my main simulation core can run blazingly fast in ST while calculating statistics after a round can be done in a nicely pure interface.

Do you see (the value of) any way of extending the current proposal later to support something similar?

rwbarton · 2016-09-20T21:59:33Z

Will it be possible to compare such constructors for reference equality? How would that look?

fryguybob · 2016-09-20T22:25:09Z

I have a bunch of additional features that I want and don't need to be fully worked out yet, but might influence decisions about syntax now. One consideration would be including memory model information. In the GADT form this could be something like AtomicRef#. Another extension is arrays with ArrayRef#. I want this feature in the context of STM too which would likely add some metadata to the representation. This might be just using STM instead of IO in the GADT syntax. I also want sum types. All together I would like to be able to write this for a Hashed Array Mapped Trie:

data Node k v where
    Level :: Word -> ArrayRef# (Node k v) -> STM (Node k v)
    Leaf :: Hash -> ArrayRef# (k,v) -> STM (Node k v)

or

data Node k v where
    Level :: Word -> ArrayRef# (Node k v) -> STM (Node k v)
    Leaf :: Hash -> k -> v -> STM (Node k v)
    Leaves :: Hash -> ArrayRef# (k,v) -> STM (Node k v)
    Nil :: STM (Node k v)

or even better:

data Node k v where
    Level :: Word -> ArrayRef# (Node k v) -> STM (Node k v)
    Leaf :: Hash -> k -> v -> Node k v
    Leaves :: Hash -> ArrayRef# (k,v) -> STM (Node k v)
    Nil :: Node k v

bgamari

Quite a neat idea, but I have some questions.

bgamari · 2016-09-20T22:17:23Z

proposals/0000-mutable-fields.rst

+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+UNPACK must be a no-op on constructors with mutable fields.  There's
+no sensible way to make UNPACK work with mutable fields.


Is this to say that we are unable to unpack even immutable fields within a constructor also containing mutable fields? It's not immediately obvious to me why this should be the case. It would be nice to clarify here.

I believe this refers only to unpacking a mutable constructor into another mutable constructor. But it would be nice to unpack the reasoning there. I think it goes like this:

"While C++ or Rust allow unpacking a mutable product type into another, this requires a recursive notion of object construction/initialization and would not fit here. If we tried the following:

data MutPair2 = MP Int {-# UNPACK #-} MutPair

what is the type of the MP constructor? It cannot be this:

MP :: Int -> MutPair -> IO MutPair2

because the only way to make that work would be to copy the mutable record MutPair into the MP constructor. This is (1) not what we'd want for including inner objects as value types, and (2) ruins the guarantee that UNPACK is a performance hint rather than semantically important."

@rrnewton exactly; I've expanded that section with your text.

bgamari · 2016-09-20T22:35:00Z

proposals/0000-mutable-fields.rst

+
+The ``MutPair`` constructor has the following type::
+
+  MutPair :: a -> b -> IO (MutPair a b)


It seems very odd to me to change the type of the constructor like this; this feels like a very ugly exception where the moment you add a mutable field the behavior of your constructor changes in a very fundamental way. It almost feels as though there should be some syntactic indicator to mark these special constructors (e.g. data MutPair = mutable MutPair (IOField a) (IOField b), which could also be used in the GADT case).

I tried playing around with a few alternative approaches to provide the "constructor action" via a typeclass (such that we could just hide the constructor itself), but it sadly seems to get quite complicated. My initial thought was that you could identify a constructor with its promoted kind (e.g. 'MutPair), but the presence of the field arguments in its kind make this rather awkward. To make this fly we'd need to have a better way to represent constructor identity at the type level.

Another approach would be to build this on the existing Generics infrastructure, although this also seemed quite complicated and not necessarily safe.

Regardless, while changing the type of the constructor itself is awkward, it may be the least of the evils here.

CC @RyanGlScott in case he has thoughts about Generics in this context.

rrnewton · 2016-09-20T23:45:24Z

What are the rules for what monad goes on the RHS of a constructor signature? It seems like there is a spectrum here between (1) only supporting raw State# types, and (2) giving special treatment to IO/ST/STM.

simonmar · 2016-09-21T00:56:16Z

@adamgundry Record selection works smoothly I think - the record selector returns a Ref# s a, and the typing is exactly as if you'd written the record selector by hand. Record construction should also work nicely, although of course the record constructor has an IO or ST type. Record update should probably be disallowed for a record with mutable fields. I'll add some notes about this to the proposal.

simonmar · 2016-09-21T01:27:02Z

@rwbarton It would be nice to allow mutable unboxed fields. In fact, there's no reason why Ref# shouldn't be able to take a representation-polymorphic type argument, since that doesn't affect the representation of the Ref# itself. However, we would need a family of primitives for reads and writes at the different unboxed types. I'll add a section about this to the proposal.

simonmar · 2016-09-21T02:10:53Z

@goldfire yes GADT syntax gives us a great way to specify whether you want IO or ST (or later STM), so I agree this is the way to go. There is still magic though: mutable a turns into a for the constructor, but IOField a when pattern matching. I'm not crazy about adding a new keyword, but I don't have a better suggestion so I think we could go with this. Anyone else have thoughts before I update the proposal?

- Add a section on records - Add a section on mutable unboxed fields - Add various unresolved questions - Move the unresolved GC section to a TODO

cartazio · 2016-09-21T02:22:58Z

I would hope that it's not hard coded to just ST and IO, at the very least I've my own third sibling ,
STE https://hackage.haskell.org/package/monad-ste that is valid in any use site of MonadPrim or or PrimBaseMonad along with ST and IO

Would this boil down to "is the monad thing a new type around the raw internal rep that we use for IO Or ST"?

Granted there's then several parallel yaks around fields that form valid mvars or stm variables I guess ?

simonmar · 2016-09-21T02:26:17Z

@fryguybob why do you say we would need a different AtomicRef#? I imagined we would just have atomic primops that operate on the ordinary Ref#, in the same way that we have atomic primops for MutVar#.

fryguybob · 2016-09-21T12:49:56Z

@simonmar right, this choice would go along with having a memory model. C++11/14/17 whatever distinguishes atomic locations and concurrent accesses (with at least one a write) outside of those locations are undefined behavior. Other approaches may work but at a minimum we should be ensuring that enough information gets down to say the LLVM level, if it is doing the code generation. For instance we may know that an atomic release store on TSO is a normal store, but we don't want LLVM to reorder that store in a basic block. I imagine we are doing the right thing now for this particular case, but it may be very convenient to distinguish explicitly synchronization locations when it does come to having a memory model. In addition, I think volatile is considered orthogonality in LLVM preventing things like accumulating in a register from a value that may be accessed in the same thread from a signal handler.

ggreif · 2016-09-21T20:12:57Z

Just leaving my reddit comment here for consideration...

Yes, I'd also prefer some syntactic clue, that something different is going on.

data MutPair a b in IO = MutPair (IOField a) (IOField b)

sounds like a possibility.

wdanilo · 2018-08-17T00:07:35Z

@fryguybob I was writing a message earlier to thank you for such detailed answer but it seems I've never sent it, so thank you!

Btw, did anything change regarding this topic? I know it's not the highest priority but I'm waiting for it like crazy now! :)

fryguybob · 2018-08-17T22:44:37Z

@wdanilo I don't have any major updates at this point. I will probably have discussions with people about working toward an implementation that can be merged in at some point at Haskell Symposium.

Ericson2314 · 2018-08-17T23:44:23Z

If the linear types proposal is accepted, we could perhaps someday purely do this at the library level. I'd be interested to see that design fleshed out.

winterland1989 · 2018-08-18T05:09:58Z

I think it's worthwhile to describe the difference between a primitive mutable field and a lifted one in full detail because a closure with only primitive mutable fields do not need write barrier, while lifted fields require them. I even prefer the extension can be split into two variations: MutableFields and PrimMutableFields for different performance consideration.

yav · 2018-08-20T07:11:51Z

@winterland1989 I would expect the compiler to just do "the right thing" when emitting code (e.g., if it does not need a write barrier it should not emit one). What would be the benefit of having two separate extensions?

winterland1989 · 2018-08-20T11:32:10Z

@yav The benefit of a separate PrimMutableFields is twofold:

Currently, we don't have primitive references. Users simulate it with MutableByteArray#, but it carries an extra word (length which is always 1). PrimMutableFields can provide a good replacement.
I believe PrimMutableFields have a different impact on implementation and performance, and some operations should be limited to primitive fields, such as fetch-and-add.

Of course, if we can state all these differences clearly and the compiler is smart enough to always do the right thing, a unified extension is OK.

fryguybob · 2018-08-20T19:45:15Z

@winterland1989 I think we should be fine with just one extension. In my implementation you can specify an unlifted type directly and it does what you would expect. Matching on an unlifted field gives a RefU# type which supports only unlifted types. Writing to a non-pointer field does not incur the GC write barrier. If you have a constructor with only non-pointer fields I do still generate two info tables. All the information is available at the right place for me to improve that, however.

I think some sort of levity polymorphism would be needed to handle what is normally done with strictness and unpacking. If you had something like:

data M a where
    MkM :: {-# UNPACK #_} mutable !a -> IO (M a)

When you match on an M Int do you get a Ref# or a RefU#? I haven't gone down that rabbit hole yet.

winterland1989 · 2018-08-21T03:15:03Z

@fryguybob The unpacking example probably shouldn't work since we don't allow unpack polymorphic field anyway. But we probably shouldn't allow unpacking arbitrary concrete type either.

Do you think it helps if we introduce another mutable# keyword? We can limit the types which mutable# accept (and unpack), matching on a mutable# field will always result in a RefU#, which may support exotic atomic operations.

treeowl · 2021-02-16T20:12:53Z

Is someone still working on this?

Boarders · 2021-02-16T20:27:40Z

@treeowl I've wondered the same thing, I think this would open up lots of interesting Haskell performance which didn't rely on strange aspects of ArrayArray# and the like.

fryguybob · 2021-02-16T20:51:40Z

I have worked on it some in the last year when I had a little time.

https://github.com/fryguybob/ghc/tree/wip/ryates/mutable-fields-9

It is sort of in a state of minimal working example for only IO based mutable constructor fields.

ramirez7 · 2021-02-16T20:59:02Z

Now that we have -XLinearTypes, I think there's room in this proposal for a third Data.Mutable API as well. Maybe something like:

module Data.Mutable.Linear where
  type Mutable a = ??? -- Unsure what this should be
  readRef :: Mutable a %1-> (Ur a, Mutable a)
  writeRef :: a -> Mutable a %1-> Mutable a

I'd imagine if we wanted to leverage LTs for this proposal, more changes would also be needed.

sgraf812 · 2021-02-17T11:58:28Z

I would be very happy to see progress here! Thanks for your efforts, @fryguybob.

I'm not convinced considering linear types adds any value to this proposal. Sure, the API will be unsafe, but then you can always expose a safe, linearly-typed layer over it, as is the case for mutable arrays, for example. The primitives are ST/IO-based (e.g. state-token passing based), and it works well enough. Unless I see a use case that really needs specific treatment for linear types, I don't see why we should complicate the proposal even more.

ramirez7 · 2021-02-17T17:30:13Z

@sgraf812 If I understand this proposal correctly, that route would not work very well. Correct me if I'm wrong:

The proposed extension generates Mutable fields in IO or ST, and when you pattern match on the record, you're now forced to use either of those. So the benefit of adding LTs as a third option would be to avoid having the end-user (i.e. the person who defined their records) perform unsafe operations. It's not really worth it if you have to write the unsafe IO to LT wrapper for each of your records. This is unlike mutable arrays, which are a single type and therefore can hide the IO completely internally in a single module. Mutable record fields otoh give you that IO/ST everywhere you define them.

More concretely:

I don't think you can safely write this shim:

module Data.Mutable.Linear where
  type Mutable a = Data.Mutable.IO.Mutable a
  readRef :: Mutable a %1-> (Ur a, Mutable a)
  writeRef :: a -> Mutable a %1-> Mutable a

because that would allow you to break referential transparency. There's no guarantee that that IO.Mutable a is used linearly throughout the program. With the linear mutable arrays you mention, you can use the module system to get this guarantee.

So I think that would be the value of including a LT option to this extension. We would need a different Mutable type for LTs that is distinct, and we would then know that all Linear.Mutable values are used linearly, which in turn allows for this safe, linear, pure API. And frees the end-programmer from unsafe boilerplate.

Whether or not that should go in this extension or a future incremental one 🤷‍♂️ This proposal has plenty of value as-is and great work already towards implementation thanks to @fryguybob So a future incremental one may make more sense, especially since -XLinearTypes is so new and we are still learning our way around it as a programming community.

sgraf812 · 2021-02-17T19:54:15Z

Correct me if I'm wrong: The proposed extension generates Mutable fields in IO or ST, and when you pattern match on the record, you're now forced to use either of those.

Basically, yes. I'm not familiar with the implementation, but I'm pretty sure that internally we won't have "implementations in IO or ST", but just one that uses state-token passing, e.g. State# s -> (# State# s, t #), which is representationally equal to ST s t and to IO t if s is specialised to RealWorld#. Note that the arrow there is basically a linear arrow for all intends and purposes (it seems that linear-base even redefines it as such), but it is not really checked by GHC (which has led to bugs in the past) and it doesn't matter for surface language semantics, because it's an implementation detail of the compiler. (Whether the compiler should make more use of linear types in its PrimOps is an entirely unrelated discussion.)

"Make it compatible with Linear Types" isn't a demand that makes sense; You can always use an unsafe non-linear arrow and coerce it into a linear one, offering a safe API over that. You restrict the programs the user can write, thereby making an API that previously allowed unsafe operations into one that only allows safe programs by construction.

With that in mind...

So the benefit of adding LTs as a third option would be to avoid having the end-user (i.e. the person who defined their records) perform unsafe operations. It's not really worth it if you have to write the unsafe IO to LT wrapper for each of your records.

I think you want new syntactic sugar for linear types and mutable record fields. E.g. if you have

data MutPair a b where
  MutPair :: { mutFst :: mutable a
             , mutSnd :: mutable b
             } -> IO (MutPair a b}

You want the corresponding safe, linear API to be "autogenerated" (to the extent that I understand -XLinearTypes):

newtype LMutPair a b = LMutPair' (MutPair a b) -- constructor is internal!
pattern LMutPair :: a -> b -> LMutPair a b
get_mutFst :: LMutPair a b %1-> (a, LMutPair a b) 
get_mutSnd :: LMutPair a b %1-> (b, LMutPair a b)
set_mutFst :: LMutPair a b %1 -> a %1-> LMutPair a b 
set_mutSnd :: LMutPair a b %1 -> b %1-> LMutPair a b

set_mutFst (LMutPair' p) a = unsafePerformIO (writeRef (mutFst p) a) `pseq` LMutPair' p

The other functions are defined similarly to set_mutFst, using an abundance of unsafePerformIO. I already see a bunch of new concerns:

Should the functions operate in a Linear state monad directly?
Is get_mutFst :: LMutPair a b -> a OK?
get_ and set_ is ugly, I want record accessors and update syntax! Can't we integrate it with one of the record field proposals?

Etc. Finally, I don't see why you couldn't just use TH to derive this API from the MutPair data type. Heck, even some Generic-based implementation should do it if we can have generic-lens and you prefer that.

This is unlike mutable arrays, which are a single type and therefore can hide the IO completely internally in a single module. Mutable record fields otoh give you that IO/ST everywhere you define them.

I don't think it's different at all. A data type mit mutable fields that is meant to be used linearly should define a corresponding linear API. linear-base defines its own shim over mutable arrays, for example. If you want to retro-fit a linear API on a mutable data constructor, use the corresponding TH function. Another point for this derivation business not to end up in a language extension.

Ericson2314 · 2021-02-17T19:58:50Z

Frankly, I wish this and the linear types extension were co-designed in the past, rather than trying to retrofit one onto the other after the fact. Being able to tie together the type system changes to low level FFI stuff like this would have been wonderful.

sgraf812

This proposal was probably written before the general proposal skeleton was added. Nevertheless, I think it would benefit from being adpated to fit that skeleton.

sgraf812 · 2021-02-17T20:06:03Z

proposals/0000-mutable-fields.rst

+- ``Data.Mutable.IO.Mutable a``, if the constructor has an ``IO``
+  return type
+- ``Data.Mutable.ST.Mutable s a``, if the constructor has an ``ST s``
+  return type, or a ``State# s -> (# State# s, t #)`` return type


This seems very crude. Ideally, we'd have something like PrimMonad as a builtin thing and infer PrimMonad m => m t here.

I don't think we'll see that anytime soon, so an alternative would be a sort of "PrimMonad light":

module Data.Mutable where class PrimMonadLight m where -- feel free to bike-shed type Mutable m :: Type -> Type -- or `forall r. TYPE r -> Type` to accomodate runtime-rep poly readRef :: Mutable a -> m a writeRef :: Mutable a -> a -> m ()

And corresponding instances for IO and ST.

sgraf812 · 2021-02-17T20:16:03Z

proposals/0000-mutable-fields.rst

+- The new ``mutable`` keyword declares a mutable field
+- The return type is in ``IO``: a constructor with any ``mutable``
+  fields *must* have a return type that has one of the forms ``IO t``,
+  ``ST s t``, or ``State# s -> (# State# s, t #)``, where ``t`` takes


Similar to my comment on 1.2.2, it would be much nicer to require a a -> b -> (forall m. PrimMonad m => m (MutPair a b)) return type instead of this quite arbitrary list of permitted types. Although that is not really simple and begs the question whether we can rid ourselves of the constraint in all cases...

sgraf812 · 2021-02-17T20:16:23Z

proposals/0000-mutable-fields.rst

+  fields *must* have a return type that has one of the forms ``IO t``,
+  ``ST s t``, or ``State# s -> (# State# s, t #)``, where ``t`` takes
+  the form of the normal return type for the constructor. In the
+  latter two cases, the type variable ``s`` must appear in ``t``.


Suggested change

latter two cases, the type variable ``s`` must appear in ``t``.

latter two cases, the type variable ``s`` must not appear in ``t``.

sgraf812 · 2021-02-17T20:27:10Z

proposals/0000-mutable-fields.rst

+
+A ``Ref#`` represents a mutable field of a constructor.  Although ``Ref#``
+appears to be a normal first-class primitive type, its *runtime
+representation* will be ``(# Any, Int# #)``, that is, an unboxed pair


So why invent an entirely new runtime-rep Ref then? Hopefully that is just a synonym for TupleRep [ BoxedRep Lifted, IntRep ]. Also shouldn't this proposal allow mutable fields in unlifted data types (the proposal of which has been accepted, #265)? In that case, Ref# should take an argument l :: Levity. E.g.

Ref# :: forall (l :: Levity). * -> * -> TYPE (TupleRep [ BoxedRep l, IntRep ])`

But this proposal appears to rely on Any, which must have lifted rep today, so maybe supporting unlifted data types isn't possible.

I assume the reason to store the offset separately is to be able to insert the write barrier for the object that contains the field. Is that correct? I'd appreciate a forward reference if so.

It is talking about the implementation here, how the reference itself can be represented by pairing a pointer to a heap object with a mutable field, with the offset to the field. My implementation started before levity polymorphism was completed so I actually have types for lifted and unlifted values in the field that the references points too. I couldn't avoid making a Ref# type as we need to know rather late how to generate code for accesses to references.

sgraf812 · 2021-02-17T20:28:58Z

proposals/0000-mutable-fields.rst

+
+So, ``{-# UNPACK #-} !T`` cannot do anything if ``T`` is a type with
+mutable constructors.  However, ``UNPACK`` annotations can be used as
+normal on immutable fields in the definition of a mutable constructor.


Does that mean UNPACK on a field with a mutable constructor is an error? Or is it just ignored?

I would prefer an error.

sgraf812 · 2021-02-17T20:48:55Z

proposals/0000-mutable-fields.rst

+
+`Eq` is supported by using ``reallyUnsafePtrEquality#`` to compare
+mutable constructors, but we must ensure that the constructors are
+evaluated strictly in the same way as we do for ``dataToTag#``.


I really don't want that kind of derived Eq instance. I find it very misleading; derived Eq instances normally determine structural equality, not referential equality. For example, users might add a mutable field to some pre-existing record and then wonder why the semantics of the derived Eq instance change completely.

I don't even see a reason why the Eq instance couldn't just compute structural equality! A field being mutable doesn't mean reading it is impossible. I certainly would advocate for deriving to have the same semantics as today if it doesn't lead to an error.

I don't even see a reason why the Eq instance couldn't just compute structural equality! A field being mutable doesn't mean reading it is impossible. I certainly would advocate for deriving to have the same semantics as today if it doesn't lead to an error.

That would be ideal, and is what Rust does already, but it unfortunately violates referential transparency: reading from a mutable field must be done in a monad, and none is available in a call to Eq. Perhaps deriving Eq should be an error, with a separate type class for pointer equality?

sgraf812 · 2021-02-17T20:51:28Z

proposals/0000-mutable-fields.rst

+
+- Its constructor has the declared type
+- It has identity, and equality is implemented using pointer equality
+  (see "Deriving" above).


Well, that's why you want that weird Eq instance. I don't like it.

There are really no alternative implementations I can think of that do not violate referential transparency. One could make this an error, though.

sgraf812 · 2021-02-17T21:01:44Z

proposals/0000-mutable-fields.rst

+operation creates a new one that should compare equal only to itself.
+Therefore the usual optimisation for nullary constructors whereby we
+share a single global copy of the constructor does *not* apply to
+mutable constructors, and we must always allocate them in the heap.


We have Unique for that use case. It performs better and is less quirky wrt. evaluatedness.

If you want to somehow optimise Eq instances for pointer equality, I'd much rather support a design like https://arxiv.org/pdf/2003.01685.pdf. E.g., export a primitive

-- | Semantically equivalent to `\a b k -> k`, but GHC is free to return `True` -- if `a` and `b` are referentially equal and omit evaluating `k`. GHC.Exts.withPtrEq :: a -> a -> Bool -> Bool instance Eq (MutPair a b) where MutPair a1 b1 == MutPair a2 b2 = withPtrEq a1 a2 (a1 == a2) && withPtrEq b1 b2 (b1 == b2)

sgraf812 · 2021-02-17T21:14:03Z

proposals/0000-mutable-fields.rst

+     = ui,       otherwise
+
+Primitives
+^^^^^^^^^^


Should this still be nested under "Core"?

sgraf812 · 2021-02-17T21:26:02Z

proposals/0000-mutable-fields.rst

+* Can we add a way to include mutable arrays in a constructor?
+* It would be great to allow STM as an option in addition to IO and
+  ST.  The constructor will need to store extra metadata, because
+  TVar# is more complex than MutVar#.


Sounds like another argument for PrimMonadLight. I'm having a hard time imagining what the intended semantics would look like for mutable fields in STM though...

STM with mutable fields was the central part of my thesis and why I made an implementation. The difficulty with pushing that to a high level with something like PrimMonad is we need extra fields in STM heap objects (at least that is a desirable design choice).

treeowl · 2021-02-17T21:32:53Z

@sgraf812 my concern about PrimMonad specifically is that it is tied into GHC's internal representations of IO and ST. Exposing that in a language feature seems quite wrong. I wouldn't be opposed to using some internal version that doesn't expose that. I am curious whether some linearity-based approach could unify these in a more principled way, but I don't know nearly enough to guess.

sgraf812 · 2021-02-17T21:40:58Z

I don't think we need to expose the methods of PrimMonad (although we could in a GHC-specific module), it suffices to specify the instances. We do the same for Coercible.

Also note that the proposal already exposes the internal State# s -> (# State# s, t #) type by offering it as one of 3 alternative return types in the constructor signature. If anything, hiding all 3 types behind the rather abstract PrimMonad constraint makes the proposal less implementation specific.

As I said above, I don't see how linear types help. They are just in the type system and have no operational nature. You'd need unsafe primops and hide them behind a linear API (unless you want to extend Core with new LT-specific primitive operations...), which I did in the comment above, using primops this proposal provides.

simonmar · 2021-02-23T16:32:53Z

Is someone still working on this?

I still think this proposal is important, but my bandwidth to work on it is very limited so I'd be more than happy if someone else wanted to pick it up and drive it.

sgraf812 · 2022-01-24T17:42:18Z

From the perspective of GHC, it would be so good to have some form of this proposal implemented. I believe that we could have significant savings in allocations in GHC by using a transient (e.g. locally mutable and selectively frozen) IntMap data structure backing UniqFM/VarEnv/InScopeSet/whatnot. Each update/insertion into these structures cost us a full re-allocation of the spine. Very costly if they are huge! I think we are suffering death by a thousand papercuts.

Unfortunately, a good source level representation of opt-in mutability eludes me.

santiweight · 2022-11-28T20:21:25Z

Fwiw: we would use this at work if it were implemented! We need it because we have a system in which we have many small components which are being mutated at high rates. There is an overhead to StateT that is becoming too large for us, and IORef is also not great.

It's unclear to me whether Simon's original proposal or a linear-types-focused version would have better usability for our use case. The nice thing for us about a linear-types version is that it would the mutation would be able to live outside of IO.

DemiMarie · 2022-11-29T08:09:01Z

proposals/0000-mutable-fields.rst

+possible to eliminate a layer of indirection in mutable data
+structures in a type-safe way, where currently this can only be done
+by using ``unsafeCoerce`` tricks; see for example `the structs
+package <http://hackage.haskell.org/package/structs>`_.


Suggested change

package <http://hackage.haskell.org/package/structs>`_.

package <https://hackage.haskell.org/package/structs>`_.

DemiMarie · 2022-11-29T08:09:57Z

proposals/0000-mutable-fields.rst

+As a special case, this proposal also provides for the declaration of
+constructors with *identity*. That is, constructors that are created
+by an explicitly effectful operation, and that can be compared using
+pointer-equality.  Currently only a few built-in types like ``IORef``


Does the pointer equality need to be a primitive, or can it be implemented with a magic type class and reallyUnsafePtrEquality#?

DemiMarie · 2022-11-29T08:11:09Z

proposals/0000-mutable-fields.rst

+- mutable fields *must not* have a strictness annotation. (we
+  anticipate that support for strictness annotations on mutable fields
+  will be a future proposal).


I recommend having the default be strict, as that is more likely to be what people who care about performance actually want.

DemiMarie · 2022-11-29T08:16:35Z

proposals/0000-mutable-fields.rst

+
+So, ``{-# UNPACK #-} !T`` cannot do anything if ``T`` is a type with
+mutable constructors.  However, ``UNPACK`` annotations can be used as
+normal on immutable fields in the definition of a mutable constructor.


I would prefer an error.

DemiMarie · 2022-11-29T08:18:53Z

proposals/0000-mutable-fields.rst

+
+`Eq` is supported by using ``reallyUnsafePtrEquality#`` to compare
+mutable constructors, but we must ensure that the constructors are
+evaluated strictly in the same way as we do for ``dataToTag#``.


I don't even see a reason why the Eq instance couldn't just compute structural equality! A field being mutable doesn't mean reading it is impossible. I certainly would advocate for deriving to have the same semantics as today if it doesn't lead to an error.

That would be ideal, and is what Rust does already, but it unfortunately violates referential transparency: reading from a mutable field must be done in a monad, and none is available in a call to Eq. Perhaps deriving Eq should be an error, with a separate type class for pointer equality?

DemiMarie · 2022-11-29T08:20:15Z

proposals/0000-mutable-fields.rst

+because the only way to make that work would be to copy the mutable
+record ``MutPair`` into the ``MP`` constructor. This is (1) not what
+we'd want for including inner objects as value types, and (2) ruins
+the guarantee that ``UNPACK`` is a performance hint rather than
+semantically important.


From an imperative language background, it seems that this is what one would want, at least assuming that GHC can avoid the allocation in most cases. UNPACK might be the wrong pragma, though.

DemiMarie · 2022-11-29T08:20:44Z

proposals/0000-mutable-fields.rst

+
+- Its constructor has the declared type
+- It has identity, and equality is implemented using pointer equality
+  (see "Deriving" above).


There are really no alternative implementations I can think of that do not violate referential transparency. One could make this an error, though.

first draft of mutable fields proposal

2be3db6

ramonkeypr reviewed Sep 20, 2016

View reviewed changes

formatting tweak

0bdcdbe

simonmar added 2 commits September 20, 2016 22:30

Merge pull request #1 from ggreif/patch-1

5728a48

formatting tweak

Can we get rid of MutVar#?

31bbe80

bgamari reviewed Sep 20, 2016

View reviewed changes

Updates in response to comments

bf4fb1c

- Add a section on records - Add a section on mutable unboxed fields - Add various unresolved questions - Move the unresolved GC section to a TODO

simonmar added 3 commits September 21, 2016 11:30

Clarify UNPACK using text from @rrnewton

bad7c29

formatting nit

db29853

another formatting nit

ee99f19

bravit added the Proposal label Nov 11, 2018

bravit removed the Proposal label Dec 3, 2018

bravit unassigned rrnewton Dec 3, 2018

nomeata force-pushed the master branch 3 times, most recently from 6b33e58 to e7fdbc7 Compare June 11, 2019 12:04

deepfire-pusher mentioned this pull request Oct 15, 2019

Local modules #283

Open

sgraf812 reviewed Feb 17, 2021

View reviewed changes

DemiMarie reviewed Nov 29, 2022

View reviewed changes


		The ``MutPair`` constructor has the following type::

		MutPair :: a -> b -> IO (MutPair a b)

	latter two cases, the type variable ``s`` must appear in ``t``.
	latter two cases, the type variable ``s`` must not appear in ``t``.

	package <http://hackage.haskell.org/package/structs>`_.
	package <https://hackage.haskell.org/package/structs>`_.

Mutable constructor fields #8

Are you sure you want to change the base?

Mutable constructor fields #8

Conversation

simonmar commented Sep 20, 2016

simonmar commented Sep 20, 2016

Ericson2314 commented Sep 20, 2016

Ericson2314 commented Sep 20, 2016 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

yav May 25, 2017 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cartazio commented Sep 20, 2016

@ramonkeypr commented on this pull request.

cartazio commented Sep 20, 2016

goldfirere commented Sep 20, 2016

adamgundry commented Sep 20, 2016

rwbarton commented Sep 20, 2016

dcoutts commented Sep 20, 2016

remiturk commented Sep 20, 2016

rwbarton commented Sep 20, 2016

fryguybob commented Sep 20, 2016

bgamari left a comment

Choose a reason for hiding this comment

bgamari Sep 20, 2016 • edited

Choose a reason for hiding this comment

rrnewton Sep 20, 2016 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

bgamari Sep 20, 2016 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rrnewton commented Sep 20, 2016

simonmar commented Sep 21, 2016

simonmar commented Sep 21, 2016

simonmar commented Sep 21, 2016

cartazio commented Sep 21, 2016

simonmar commented Sep 21, 2016

fryguybob commented Sep 21, 2016

ggreif commented Sep 21, 2016

wdanilo commented Aug 17, 2018

fryguybob commented Aug 17, 2018

Ericson2314 commented Aug 17, 2018

winterland1989 commented Aug 18, 2018 • edited

yav commented Aug 20, 2018

winterland1989 commented Aug 20, 2018

fryguybob commented Aug 20, 2018

winterland1989 commented Aug 21, 2018

treeowl commented Feb 16, 2021

Boarders commented Feb 16, 2021

fryguybob commented Feb 16, 2021

ramirez7 commented Feb 16, 2021 • edited

sgraf812 commented Feb 17, 2021

ramirez7 commented Feb 17, 2021

sgraf812 commented Feb 17, 2021

Ericson2314 commented Feb 17, 2021

sgraf812 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Ericson2314 commented Sep 20, 2016 •

edited

yav May 25, 2017 •

edited

bgamari Sep 20, 2016 •

edited

rrnewton Sep 20, 2016 •

edited

bgamari Sep 20, 2016 •

edited

winterland1989 commented Aug 18, 2018 •

edited

ramirez7 commented Feb 16, 2021 •

edited

sgraf812 commented Feb 17, 2021 •

edited