...

Significant New Features - We recommend you consider using these features right away.

General

 DeleteLogicalFiles in a PROJECT or APPLY requires NOTHOR

...

https://track.hpccsystems.com/browse/HPCC-10815

ECL using the 4.0.4 DeleteOwnedSubFiles feature does not compile on 4.2.0 (fixed in 4.2.2)

A function called DeleteOwnedSubFiles was added in 4.0.4 to allow subfiles to be removed from a superfile if they are referenced only from that one superfile.

...

https://track.hpccsystems.com/browse/HPCC-10288

In some cases, the standard Python libraries do not link statically to the Python core (fixed in 4.2.2)

On some distros (CentOS in particular), the standard Python 2.6 packages have been built in such a way that the standard Python libraries do not link statically to the Python core. This results in undefined symbols (typically _Py_ZeroStruct) when trying to execute embedded Python code that uses one of these libraries. The workaround is to add the following code at the top of /opt/HPCCSystems/sbin/hpcc_setenv (note that the name will vary by distro):

...

From release 4.2.2, code has been added to the plugin that supports embedded Python to perform this load automatically, so the workaround will not be required.

The expression EXISTS(a + b + ... + last) occasionally being evaluated as EXISTS(last) (Fixed in 4.2)

This happened only if the expression was evaluated inside a transform/filter, rather than generating a child query.

...

https://track.hpccsystems.com/browse/HPCC-10309

Superfile issue causing incorrect compression symptoms (fixed in 4.2.2)

This problem was seen when using a superfile that had a few superfiles as subfiles, the first of which was empty. In that situation, the file descriptor for the superfile configures the common shared attributes incorrectly. In particular, it fails to set @blockCompression, which causes the engines to treat the file as uncompressed when in fact all files are compressed.

This in turn leads to deserialization problems. In the reported case, the deserializer tried to allocate a massive buffer and ran out of memory. Other symptoms could include deserialization errors relating to reading beyond the end of the stream, or record size mismatch errors.

https://track.hpccsystems.com/browse/HPCC-10319

New environment.conf setting using epoll() instead of select() (Fixed in 4.2)

When listening for input on a number of sockets (dafilesrv and Thor do so quite often), we now use the epoll() system call rather than the select() system call. This can be much more efficient when large numbers of sockets are involved.
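As a rough illustration of the pattern described above (not the platform's own code, which is not shown here), the following Python sketch waits on several sockets at once and uses epoll where the operating system provides it. The helper name wait_for_readable and the socket pairs are made up for the example.

```python
import selectors
import socket

# Prefer epoll explicitly when the OS offers it (Linux); otherwise fall back
# to whatever the standard library considers the best available mechanism.
SelectorClass = getattr(selectors, "EpollSelector", selectors.DefaultSelector)

def wait_for_readable(socks, timeout=1.0):
    """Return the sockets in `socks` that have data waiting to be read."""
    sel = SelectorClass()
    try:
        for s in socks:
            sel.register(s, selectors.EVENT_READ)
        # One call reports every ready socket; there is no need to rebuild
        # and rescan a descriptor set for each wait, as select() requires.
        return [key.fileobj for key, _events in sel.select(timeout)]
    finally:
        sel.close()

# Build a few connected socket pairs and write to only one of them.
pairs = [socket.socketpair() for _ in range(3)]
readers = [r for r, _w in pairs]
pairs[1][1].sendall(b"ping")

ready = wait_for_readable(readers)

for r, w in pairs:
    r.close()
    w.close()
```

Only the socket that received data is reported as ready. With a handful of sockets the difference from select() is negligible; the benefit appears as the number of registered sockets grows.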

...

https://track.hpccsystems.com/browse/HPCC-9415

Persist file per code-hash (Fixed in 4.2)

In previous versions of the platform, a persist generated a single output file with a name that matched the string supplied within the persist(). It was rebuilt if a query was submitted which was based on different code, or the input files had changed.

...

https://track.hpccsystems.com/browse/HPCC-10022

Other improvements to the PERSIST functionality (Fixed in 4.2)

Improvements have also been made to the way that the expiry of persist files is handled. Rather than calculating the expiry date as a fixed time from when the file was created, we now track the last access to a file and expire based on that.
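The difference between the two expiry rules can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the seven-day lifetime, the dates, and the function names are assumptions made for the example, not platform defaults.

```python
from datetime import datetime, timedelta

def expires_from_creation(created, lifetime):
    """Older rule: expiry is a fixed interval after the file was created."""
    return created + lifetime

def expires_from_last_access(created, last_access, lifetime):
    """Newer rule: expiry is a fixed interval after the most recent access."""
    return max(created, last_access) + lifetime

lifetime = timedelta(days=7)          # hypothetical retention period
created = datetime(2014, 1, 1)
last_access = datetime(2014, 1, 6)    # file read again five days later

old_expiry = expires_from_creation(created, lifetime)
new_expiry = expires_from_last_access(created, last_access, lifetime)
print(old_expiry.date(), new_expiry.date())
```

Under the older rule the file would expire a week after creation even though it was still in use. Under the newer rule each access pushes the expiry date back.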

...

https://track.hpccsystems.com/browse/HPCC-9985

Improvements to the version of JOIN that allows an ATMOST attribute with a condition and a limit (Fixed in 4.2)

The previous implementation was not quite correct for global joins in Thor. It sorted and distributed the input files by id and name, which meant that rows with the same id could end up on different nodes, so some matches could be lost. Now it distributes by id only, but sorts by id and name.
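To see why the distribution key matters, here is a small Python simulation. It is a sketch under stated assumptions: the four-node cluster, the toy byte-sum hash, and the sample rows are all invented for illustration and are not how Thor hashes rows.

```python
from collections import defaultdict

NODES = 4  # hypothetical cluster size

def toy_hash(*fields):
    """Toy stand-in for a distribution hash: byte sum of the key, modulo NODES."""
    return sum("|".join(map(str, fields)).encode()) % NODES

def distribute(rows, key_fn):
    """Place each (id, name) row on a node, then sort each node by (id, name)."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[key_fn(row)].append(row)
    return {node: sorted(part) for node, part in nodes.items()}

def nodes_holding(parts, row_id):
    """Return the set of nodes that hold at least one row with this id."""
    return {node for node, part in parts.items()
            if any(r[0] == row_id for r in part)}

rows = [(1, "bob"), (1, "ann"), (2, "cat"), (2, "dan")]

old_style = distribute(rows, lambda r: toy_hash(r[0], r[1]))  # by id and name
new_style = distribute(rows, lambda r: toy_hash(r[0]))        # by id only

print(nodes_holding(old_style, 1))  # id 1 is spread over more than one node
print(nodes_holding(new_style, 1))  # id 1 lives on exactly one node
```

When the rows for one id are split across nodes, no single node sees the full group, so any per-id logic such as the ATMOST check works from incomplete information. Distributing by id alone keeps each group together, and the local sort by id and name still provides the ordering the join needs.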

...

https://track.hpccsystems.com/browse/HPCC-9711

Improvements to ENUM where the first element matches a typedef name could affect existing layouts (fixed in 4.2)

Code that attempted to use a typedef-ed type as the base type for an enumeration was not interpreted as the user expected. The type name was instead treated as the first element of the enumeration, which could lead to unexpected behaviour in any code that attempted to use this feature.

...

https://track.hpccsystems.com/browse/HPCC-9553

Optimization of MANY LOOKUP

In previous releases, a MANY LOOKUP join with a high number of matching right-hand-side key values suffered a severe degradation in speed. The symptom was a large delay after the lookup join had read all of its right-hand-side rows. In severe cases, this gave the impression that the JOIN had stalled before outputting any matches. This has been resolved in 4.2.

Significant New Features

New ZAP button in ECL Watch for information gathering when reporting issues

The ZAP (Zipped Analysis Package) button is located on the workunit details page in ECL Watch.

...

https://track.hpccsystems.com/browse/HPCC-7899

New Group Join

The new GROUP JOIN syntax allows you to efficiently join two datasets on one condition, but have the result grouped by another condition. This is useful for efficiently solving some relationship matching problems. As a first approximation the following ECL:

...

https://track.hpccsystems.com/browse/HPCC-9951

A new flexible lookup join - JOIN, SMART

A SMART join attempts to perform an in-memory LOOKUP join. If there is insufficient memory, smart join will automatically ensure that both sides are efficiently distributed and attempt to perform a LOCAL LOOKUP join.
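The two strategies named above can be sketched in Python as follows. This is a conceptual model only, not the platform's implementation: the row-count budget max_rows_in_memory stands in for the real memory check, and the function names, node count, and sample rows are assumptions for the example.

```python
from collections import defaultdict

def lookup_join(left, right, key):
    """In-memory lookup join: index the right side, then probe with each left row."""
    index = defaultdict(list)
    for r in right:
        index[r[key]].append(r)
    return [(l, r) for l in left for r in index.get(l[key], [])]

def local_lookup_join(left, right, key, nodes):
    """Distribute both sides by key, then run a lookup join on each partition."""
    def split(rows):
        parts = [[] for _ in range(nodes)]
        for row in rows:
            parts[hash(row[key]) % nodes].append(row)
        return parts
    out = []
    for lpart, rpart in zip(split(left), split(right)):
        out.extend(lookup_join(lpart, rpart, key))
    return out

def smart_join(left, right, key, max_rows_in_memory, nodes=4):
    """Try the full lookup join first; fall back to the local variant if too big."""
    if len(right) <= max_rows_in_memory:   # hypothetical stand-in for a memory check
        return lookup_join(left, right, key)
    return local_lookup_join(left, right, key, nodes)

left = [(1, "a"), (2, "b"), (3, "c")]
right = [(1, "x"), (1, "y"), (3, "z")]

fits = smart_join(left, right, key=0, max_rows_in_memory=10)
spills = smart_join(left, right, key=0, max_rows_in_memory=1)
print(sorted(fits) == sorted(spills))
```

Both paths return the same matches. The local variant only needs each node to hold its own share of the right-hand side, which is why it is a reasonable fallback when the full lookup table does not fit.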

...