- `lines` - replacement for line-seq that returns an auto-closeable iterable and cannot cache nor hold-onto-head the data.
- `re-matches` - faster version of re-matches.
- split caffeine support off into its own namespace.
- impl/pmap really does support user-defined thread pool.
- Add clj-kondo exports and config, fix linting errors
- Remove support for and call to `take-last` 1-arity, which was not valid.
- Fix variable arity `merge-with`, which was not correctly implemented.
- `apply-concat`, `concat-opts` accept a cat-parallelism option that allows you to specify how the concatenation should be parallelized at the creation source as opposed to at the preduce/parallel reduction callsite.
- Faster compose-reducers especially where there really are a lot of reducers.
- issue 13 - any IMutList chunkedSeq was partially incorrect.
- frequencies respects map-fn option to allow concurrent hashmaps to be used.
- `cartesian-map` no longer has a random access variant. The cooler version of this uses the tensor address mechanism to allow parallel reduction.
- fixed major issue with parallel frequencies.
- Much faster every? implementation esp. for primitive arrays and persistent vectors.
- More hlet extensions - `lng-fns` and `dbl-fns` which are faster in the general case than `lngs` and `dbls` as they avoid RT/nth.
- efficient `cartesian-map` which does a cartesian join across its inputs and calls f on each value.
user> (hamf/sum-fast (lznc/cartesian-map
                      #(h/let [[a b c d] (lng-fns %)]
                         (-> (+ a b) (+ c) (+ d)))
                      [1 2 3]
                      [4 5 6]
                      [7 8 9]
                      [10 11 12 13 14]))
3645.0
- Extensible let - hlet and helpers make using the primitive overloads of clojure functions easier. See the ham-fisted.hlet and ham-fisted.primitive-invoke namespaces.
- typed 'nth' methods efficient for primitive manipulations - 'dnth', 'fnth', 'inth', 'lnth'.
- Custom reduce implemented for object array wrappers.
2.007
- reduce namespace now has helper to create a parallel reducer.
- hashset has optimized addall pathway when input is another hashset.
- partition-by accepts a predicate function in options - example in docs.
- implemented lazy noncaching partition-all - similar perf to partition-by.
- Faster default dispatch for pgroups, upgroups.
- Faster sum-fast if input is random access.
- Faster sort implemented as default in several places.
- Ensure all object sorting is done with parallelQuickSort.
- Small fix to array macros to use l2i instead of RT.intCast.
- Major issue in compose-reducers - object composition was typed to double reduction.
- slightly faster partition-by - inner loop written in java.
- Added lazy-noncaching partition-by. This method has somewhat higher performance than clojure.core/partition-by as it does not make intermediate containers and is strictly lazy-noncaching.
- Rebuilt hashmaps on faster foundation especially for micro benchmarks.
- Removed bits and pieces that do not provide enough return on investment.
- For more in-depth comments see PR-7.
- new linked hashmap implementation with equiv-semantics and fast union op.
- Fast set intersection for longer sequences of sets (intersect-sets).
- Fast pathways for finding min/max index of a collection of objects in a similar way to min-key and max-key.
- Very specific upgrade to combine-reducers pathways.
- pmap, upmap pathways now return an object that fully implements seqable and ireduceinit.
- Added n-lookahead to the parallel options pathway as for some problems this makes a major difference in the efficiency of the pmap pathway.
- Fixed pmap implementation to release memory much more aggressively.
- Fixed serious but subtle issue when a transient hash map is resized. This should be considered a must-have upgrade.
- Fixes from upgrading dtype-next.
- Major breaking changes!! API functions have been moved to make the documentation clearer and the library more maintainable in the long run.
- map-union of mutable or transient maps produces mutable or transient maps!
- Final refactoring before 1.000 release.
- Functions for creating java.util.function objects are moved to ham-fisted.function.
- Reduction-related systems are moved to ham-fisted.reduce.
- java.util.Map helpers are moved to ham-fisted.mut-map.
- java implementation of a batched stream reducer. This avoids adding java to the stream api.
- pure java reductions for situations where you have an index fn and a count. These mainly just make benchmarks a bit more stable.
- IFnDef supports interfaces for long supplier (L), double supplier (D), and Supplier (O).
- Accelerated map boolean union, intersection, difference for hashtable, long hashtable.
- Major bugfix in map dissoc.
- Added inc-consumer - returns a generic consumer that increments a long. Useful for the various situations where you need to track an incrementing variable but don't want the overhead of using a volatile variable.
- Immutable maps and vectors derive from APersistentMap and APersistentVector so that downtream libraries can pick them up transparently.
- Error in reduction of empty ranges.
- Slightly faster map construction pathways.
- In fact both the hamf base map `mut-map` and the integer-specialized `mut-long-hashtable-map` are faster than the default `clojure.data.int-map` pathway for construction and value lookup according to the benchmarks in `clojure.data.int-map`. Interestingly enough they are fastest if you create an intermediate object array using lznc/apply-concat:
user> (count entries)
1000000
user> (c/quick-bench (into (persistent! (hamf/mut-long-hashtable-map)) entries))
Evaluation count : 6 in 6 samples of 1 calls.
Execution time mean : 366.247830 ms
Execution time std-deviation : 11.024896 ms
Execution time lower quantile : 348.564420 ms ( 2.5%)
Execution time upper quantile : 376.625750 ms (97.5%)
Overhead used : 1.492920 ns
nil
user> (c/quick-bench (into (i/int-map) entries))
Evaluation count : 6 in 6 samples of 1 calls.
Execution time mean : 568.189038 ms
Execution time std-deviation : 1.677163 ms
Execution time lower quantile : 566.564884 ms ( 2.5%)
Execution time upper quantile : 570.564253 ms (97.5%)
Overhead used : 1.492920 ns
nil
user> (def ll (into (persistent! (hamf/mut-long-hashtable-map)) entries))
#'user/ll
user> (def il (into (i/int-map) entries))
#'user/il
user> (c/quick-bench
(dotimes [idx (count entries)]
(.get ^java.util.Map ll idx)))
Evaluation count : 84 in 6 samples of 14 calls.
Execution time mean : 7.399383 ms
Execution time std-deviation : 62.314286 µs
Execution time lower quantile : 7.297162 ms ( 2.5%)
Execution time upper quantile : 7.451677 ms (97.5%)
Overhead used : 1.492920 ns
nil
user> (c/quick-bench
(dotimes [idx (count entries)]
(.get ^java.util.Map il idx)))
Evaluation count : 30 in 6 samples of 5 calls.
Execution time mean : 22.936125 ms
Execution time std-deviation : 583.510734 µs
Execution time lower quantile : 22.334216 ms ( 2.5%)
Execution time upper quantile : 23.654541 ms (97.5%)
Overhead used : 1.492920 ns
nil
user>
user> (c/quick-bench (hamf/mut-long-hashtable-map (hamf/into-array Object (lznc/apply-concat entries))))
Evaluation count : 6 in 6 samples of 1 calls.
Execution time mean : 271.755568 ms
Execution time std-deviation : 1.545379 ms
Execution time lower quantile : 270.184413 ms ( 2.5%)
Execution time upper quantile : 273.736344 ms (97.5%)
Overhead used : 1.492920 ns
nil
- `wrap-array`, `wrap-array-growable`, major into-array optimizations and better `map-reducible`.
- Opening the door to custom IReduce implementations.
- convert hashsets to use hashtables instead of bitmap tries.
- careful analysis of various vec-like object creation mechanisms.
- long primitive hashtables - these are quite a bit faster but especially when used directly.
- Helpers for very high performance scenarios. lazy-noncaching/map-reducible, api/->long-predicate.
- fill-range is now properly accelerated making all downstream projects that use addAll and friends far faster.
- see commit 38596d8
- Added group-by-consumer - this has different performance and functionality characteristics than group-by-reducer. For instance, group-by-consumer with a linked hashmap will return a map with keys in the order of keys initially encountered. group-by-reducer with the same hashmap will return a map with keys in the order of latest encountered. Group-by-consumer uses `computeIfAbsent` which is a slightly faster primitive than `compute` as it doesn't need to check the return value of the reducer, only of the initialization of the map entry.
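
The `computeIfAbsent` grouping pattern can be sketched with plain Java interop. This is only an illustration built on `java.util.LinkedHashMap`, not the library's implementation, and `group-by-first-seen` is a hypothetical name introduced for the example.

```clojure
(import '[java.util ArrayList LinkedHashMap]
        '[java.util.function Function])

(defn group-by-first-seen
  "Hypothetical helper: groups coll by (f x) into a LinkedHashMap of ArrayLists."
  [f coll]
  (let [m (LinkedHashMap.)
        ;; The initializer is only invoked when the key has no entry yet,
        ;; so nothing returned by the per-item update has to be inspected.
        new-bucket (reify Function
                     (apply [_ _k] (ArrayList.)))]
    (doseq [x coll]
      (.add ^ArrayList (.computeIfAbsent m (f x) new-bucket) x))
    m))

;; Keys appear in the order they were first encountered.
(vec (keys (group-by-first-seen even? [1 2 3 4])))
;; => [false true]
```

The per-key bucket is created once and then mutated in place, which is the property the entry above attributes to the consumer-based pathway.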
- MapForward class so we can use normal java maps in normal Clojure workflows.
- bugfix - Map's `compute` has to accept nil keys.
- memoize now supports `:eviction-fn` - for callbacks when things get evicted.
- More helpers for memoized fns - cache-as-map, evict-memoized-call.
- Switch to caffeine for memoize cache and standard java library priority queue for take-min. This removed the dependency on google guava thus drastically cutting the chances for dependency conflicts.
- HUGE CHANGES!!! - moved to hashtable implementation for main non-array map instead of bitmap trie. This is because in all my tests it is much faster for everything aside from non-transient (reduce assoc ...) type loops which are a waste of time to begin with.
- Because there are now three full map implementations (array, trie, hashtable) there is a more defined map structure making it less error prone to test out different map backends.
- Lots of inner class renaming and such - however `frequencies`, `group-by-reduce`, and `mapmap` are now a bit faster - about 2X. Here is a telling performance metric:
({:construct-μs 2.726739663709692,
:access-μs 1.784634592104282,
:iterate-μs 2.7345543552812073,
:ds-name :java-hashmap}
{:construct-μs 3.5414584143710885,
:access-μs 2.761234751112207,
:iterate-μs 2.1730894775185403,
:ds-name :hamf-hashmap}
{:construct-μs 6.475808180747403,
:access-μs 2.484804237281106,
:iterate-μs 1.8564705765641765,
:ds-name :hamf-transient}
{:construct-μs 11.43649782981362,
:access-μs 5.152473242630386,
:iterate-μs 8.793332955848225,
:ds-name :clj-transient})
ham-fisted.hash-map-test>
- Faster `mode`.
- Faster map iteration.
- Corrected clojure persistent hash map iteration.
- Faster `mode`.
- `mmax-key` - use `(mmax-key f data)` as opposed to `(apply max-key f data)`. It is faster and handles empty sequences. Same goes for `mmin-key`.
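
The empty-sequence problem with `(apply max-key f data)` can be reproduced with clojure.core alone. The sketch below is not the library's `mmax-key`. `safe-max-key` is a hypothetical stand-in showing a reduce-based shape that avoids `apply`, and it assumes that returning nil for empty input is acceptable.

```clojure
;; clojure.core/max-key needs at least one item after the key fn, so
;; applying it to an empty collection throws an ArityException.
(defn apply-max-key-throws?
  [f data]
  (try
    (apply max-key f data)
    false
    (catch clojure.lang.ArityException _ true)))

(defn safe-max-key
  "Hypothetical stand-in: returns nil for empty data instead of throwing."
  [f data]
  (when-let [s (seq data)]
    (reduce (fn [best x] (if (> (f x) (f best)) x best)) s)))

(apply-max-key-throws? count [])        ;; => true
(safe-max-key count [])                 ;; => nil
(safe-max-key count ["a" "abc" "ab"])   ;; => "abc"
```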
- Fixed `make-comparator`.
- Added `mode`.
- Better obj->long and obj->double pathways that will always apply the appropriate cast and thus have 0 arg variants.
- Better/faster sort-by pathway that avoids potential intermediate data creation.
- Fixed predicate, long-consumer, double-consumer and consumer pathways.
- Faster dispatch for preduce.
- All lists are comparable.
- nil is convertible to iterable and collections without fail.
- Container reduction must respect reduced - I think this is a design flaw but not a serious or impactful one aside from requiring more complex per-container reduction code.
- Perf tweaks and small fixes from TMD.
- Additional round of optimizations around creation of persistent vector objects.
- renamed a few of the functor-creation macros.
- lznc/map explicitly supports long->obj transformations as these are often used as index->obj lookup systems.
- Rebuilt IMutList's toArray pathway to use reduction.
- Switched completely to clojure.core.protocols/CollReduce.
- Removed a solid amount of cruft and simplified reduction architecture.
- Now loading hamf transparently makes reductions on all arrays and many java datastructures such as hashmaps faster.
- Removed lots of old cruft.
- Added IFnDef predicates so you can use IFn-based predicates from java.
- `:unmerged-result?`, `:skip-finalize?` options for `preduce` and `preduce-reducer`. This allows you to use the parallelized reductions pathway but get a sequence of results back as opposed to a single result. It also allows you to use reducers or transducing-compatible rfn's that have no parallel merge pathway and handle the parallel merge yourself after the parallelized reduction.
- Fixed issue with single-map parallel reductions to ensure that it passes the parallel reduction request to its source data.
- bulk union, intersection operations.
- Faster `equiv` for longs and doubles but equivalent for everything else.
- additional set operation - parallelized `unique`.
- exposed indexed accumulator macros in api for use outside library.
- generic protocol fn add-fn that must return a reduction compatible function for a given collection.
- forgot type hints on array constructors.
- Final round of optimizations for double array creation. Turns out reductions really are faster.
- macros for double, float, long, and int array creation that will inline a fastpath if the argument is a compile-time vector or integer. Bugfix for casting floats to longs.
- shorthand macros, ivec, lvec, fvec, dvec to create array-backed containers that allow nth destructuring.
- major double-array, float-array, long-array, int-array optimizations.
- Set protocol to supersede the set protocol from dtype-next.
- lots and lots of fixes from dataset work.
- Small fixes and making helpers public for dtype-next work.
- long lists really are long lists - copy-paste mistake from int lists.
- declare-double-consumer-preducer! - given type derived from DoubleConsumer and a few others, create a parallel reducer.
- declare-consumer-preducer! - similar to above, incoming data is not expected to be a stream of double values.
- ->collection is protocol driven allowing new non-collection things like bitmaps to be turned temporarily into collections. This means that reductions and collection conversion are protocol driven.
- maps are iterable...
- Explicit protocols for serial and parallel reduction and reducers.
- Explicit support for BitSet objects.
- Updated (->reducible) pathways to check for protocol reducer support.
- Protocol for conversion of arbitrary types to iterable for map, filter support.
- Fixed comparison of seq with nonseq.
- Small perf enhancements from tmd perf regression
- Added in explicit checks for long, double, and predicate objects in filter's reduction specializations. Potentially these are too expensive but it does help a bit with longer sequences.
- Changed things such that Double/NaN evaluates to false. This matches that the null object evaluates to false and null evaluates to Double/NaN.
- Added finalize method to reducers to match transducer spec.
- Exposed `compose-reducers` that produces a new reducer from a map or sequence of other reducers.
- These changes simplified `reduce-reducers` and `preduce-reducers`, `sum` and `sum-fast`.
- Enable parallelization for instances of clojure.core.PersistentHashMap.
- protocol-based parallelization of reductions so you can extend the parallelization to new undiscovered classes.
- reducer-xform->reducer - Given a reducer and a transducer xform produce a new reducer that will apply the transform to the reduction function of the reducer:
ham-fisted.api> (reduce-reducer (reducer-xform->reducer (Sum.) (clojure.core/filter even?))
                                (range 1000))
#<Sum@70149930: {:sum 249500.0, :n-elems 500}>
- Finally a better api to group-by-reduce and group-by can now be implemented via group-by-reduce. group-by-reduce uses same 3 function arguments as preduce so your reduction systems are interchangeable between these two systems.
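
The three-function shape can be illustrated with clojure.core alone. The sketch assumes the three arguments are an init-value function, a reduction function, and a merge function. `chunked-preduce` and its chunk size of 1024 are hypothetical, and this says nothing about how the library actually schedules work.

```clojure
(defn chunked-preduce
  "Hypothetical sketch of a three-function parallel reduction."
  [init-val-fn rfn merge-fn coll]
  (->> (partition-all 1024 coll)
       ;; Each chunk is reduced independently from a fresh accumulator.
       (pmap (fn [chunk] (reduce rfn (init-val-fn) chunk)))
       ;; Partial results are combined pairwise with the merge function.
       (reduce merge-fn (init-val-fn))))

;; Summation: the same fn serves as both reducer and merge.
(chunked-preduce (constantly 0) + + (range 10000))
;; => 49995000

;; Frequencies: per-chunk maps merged by adding counts.
(chunked-preduce hash-map
                 (fn [m x] (update m x (fnil inc 0)))
                 (partial merge-with +)
                 [:a :b :a])
;; => {:a 2, :b 1}
```

Because the same three functions describe both a whole-collection reduction and a per-group reduction, a set of functions written for one can be handed to the other unchanged.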
- Fixed `conj` for all growable array lists.
- Added a protocol for parallel reductions. This allows you to pass in one object and transform it into the three functions required to do a parallel reduction.
- Added preduce-reducer, preduce-reducers for a single reducer or a sequence or map of reducers, respectively.
- group-by, group-by-reduced fixed for large n.
- min-n is now a long with parallel options.
- lazy-noncaching namespace now has map-indexed. Faster reductions and random access objects stay random access.
- Various bugfixes from dtype work.
- Ranges with more than Integer/MAX_VALUE elems can be accessed via their IFn overloads and support custom lgetLong and lgetDouble methods that take long indexes for long and double ranges.
- Use lookahead and put timeouts for all parallelization primitives so that if a long running parallelization is cancelled the forkjoin pool itself isn't hung.
- Enable long and double ranges whose size is larger than Integer/MAX_VALUE. This includes parallelized reductions which even optimized take basically forever.
- Add better defaults for reductions to long and double -specific IMutList interfaces.
- Ensure reduction implementations do not dereference a reduced accumulator.
- Fix reducible interface so it matches preduce.
- Added persistent! implementation which fails gracefully if input is already persistent.
- Fixed group-by, group-by-reduce, and pfrequencies implementation to use preduce.
- conj works on map, filter, and concat from the lazy-noncaching library.
- Remove explicit support for boolean primitives.
- IFnDef overloads implement their appropriate java.util.function counterparts.
- Removed claypoole from dependencies.
- Move typed clojure function interface definitions from Reductions to IFnDef.
- Added overrides of keys, vals that produce parallelizable collections if the input itself is a parallelizable collection - either maps from this library or any java hashmap.
- preduce has new option to help parallelize concat operations - they can be parallelized two different ways, either elemwise where each container parallelizes its reduction or by sequence where an initial reduction is done with pmap then the results are merged.
- all random access containers support spliterator and typed stream construction.
- Fix bug in upmap causing hanging with short sequences.
- double conversion to long fails for NaN.
- Careful combining of typed map/filter chains to avoid causing inaccuracies when converting from double to long.
- Major parallelism upgrade - spliterator-based objects such as java.util.hashmap and all the hashmaps/hashsets from this library now support parallelized reduction.
- Numeric values are range checked on input to addLong.
- Removed ensureCapacity from IMutList.
- Moved to double reduction as opposed to double foreach. Perftested heavily and found that reduce is just as fast and more general.
- expose sublistcheck.
- Stricter correctness checking for sublist types, everything implements Associative.
- Correctness fixes for pmap, upmap, pgroups, upgroups.
- Fixed sum for large n-elems.
- upgroups - Unordered parallel groupings for random access systems.
- Indexed consumers for copying, broadcasting type operations.
- Reducible interface for objects that can reduce themselves.
- ArraySection is now first-class, will rebase dtype-next array pathways on this.
- pmap is guaranteed not to require `shutdown-agents`.