Sunday 15 August 2010

java - Collision Resolution: Quadratic Probing vs. Separate Chaining


OK, so I've been experimenting with hash tables and different collision resolution strategies. I'm trying to figure out which is more efficient for lookups: a hash table that uses separate chaining or one that uses quadratic probing for collision resolution. My results suggest that separate chaining is faster than quadratic probing even for small load factors such as 0.4 or 0.2. Is this the case, or are my results wrong?

The difference in processing cost between the two approaches is:

(With chaining)
  • dereferencing an indirection, i.e. a pointer

vs.

(With quadratic probing)
  • evaluating a [simple but composite] arithmetic formula
  • indexing to the new location
  • possibly repeating both steps, because of collisions between the probe location and non-target values stored there; this is something chaining does not have to worry about

It should therefore come as little surprise that chaining is faster. Pointer dereferencing is a "native" instruction on most CPUs, comparable (in most cases equal) in cost to indexing into an array, which leaves the arithmetic operations and any repeated attempts as overhead of the probing method. Evaluating the probe formula requires a few CPU instructions (to increment the step number, shift some bits, and add the result to the start or current location), which easily makes it slower than a mere pointer dereference. (Possible caveat: see the edit further down, which discusses CPU caches.)
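
To make the comparison concrete, here is a minimal Java sketch of the two lookups. It is not the code from the question: the class and method names, the `EMPTY` sentinel, the `key mod m` hash, and the `(home + i*i) mod m` probe formula are all assumptions chosen for illustration.

```java
// Illustrative sketch only; every name here is hypothetical.
final class LookupCostSketch {
    // Separate chaining: each bucket heads a singly linked list.
    static final class Node {
        final int key;
        final Node next;
        Node(int key, Node next) { this.key = key; this.next = next; }
    }

    // After a collision, each further step costs one pointer dereference.
    static boolean chainedContains(Node[] buckets, int key) {
        for (Node n = buckets[Math.floorMod(key, buckets.length)]; n != null; n = n.next) {
            if (n.key == key) return true;
        }
        return false;
    }

    // Assumed marker for a free slot; this key value itself cannot be stored.
    static final int EMPTY = Integer.MIN_VALUE;

    // Quadratic probing: step i inspects slot (home + i*i) mod m, so each
    // further step costs a multiply, an add, a modulo, and an array index.
    static boolean probedContains(int[] slots, int key) {
        int m = slots.length;
        int home = Math.floorMod(key, m);
        for (int i = 0; i < m; i++) {
            int idx = (int) ((home + (long) i * i) % m);
            if (slots[idx] == EMPTY) return false; // free slot ends the search
            if (slots[idx] == key) return true;
        }
        return false; // gave up after m steps
    }
}
```

With a table of 7 slots and keys 3 and 10 (both hash to slot 3), the chained lookup for 10 follows at most one `next` pointer, while the probed lookup recomputes an index for every step.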

The advantages of quadratic (or other forms of) probing are:

  • Simpler logic for storage management (no dynamic allocation)
  • Faster inserts (because of the simpler storage)
  • Generally reduced storage requirements
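
The storage difference shows up most clearly at insert time. The following is a rough sketch under the same assumptions as any toy table (hypothetical names, `key mod m` as the hash, `(home + i*i) mod m` as the probe formula, and a sentinel for free slots); it is meant only to show where the allocation happens.

```java
// Illustrative sketch only; every name here is hypothetical.
final class InsertSketch {
    static final class Node {
        final int key;
        final Node next;
        Node(int key, Node next) { this.key = key; this.next = next; }
    }

    // Chaining: every insert allocates a new node on the heap
    // (no duplicate check in this sketch).
    static void chainedInsert(Node[] buckets, int key) {
        int b = Math.floorMod(key, buckets.length);
        buckets[b] = new Node(key, buckets[b]);
    }

    // Assumed marker for a free slot; this key value itself cannot be stored.
    static final int EMPTY = Integer.MIN_VALUE;

    // Probing: the key is written into the preallocated array, no allocation.
    // Returns the slot used, or -1 if no free slot was found in m steps.
    static int probedInsert(int[] slots, int key) {
        int m = slots.length;
        int home = Math.floorMod(key, m);
        for (int i = 0; i < m; i++) {
            int idx = (int) ((home + (long) i * i) % m);
            if (slots[idx] == EMPTY || slots[idx] == key) {
                slots[idx] = key;
                return idx;
            }
        }
        return -1;
    }
}
```

The chained table also pays for one reference per node plus the object header, which is the storage overhead the last bullet refers to.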

Thinking about this in very broad terms as a space vs. speed (or insert-time vs. search-time) trade-off, the storage overhead of chaining (mostly the pointers themselves, not counting the possible heap-management overhead) is used to store pre-calculated values of [what would otherwise be] the probing "hints". Since these calculations are already done, the chaining approach is faster at search time.
Edit (thanks, Ants)
A caveat to this argument [about pre-computed locations] is that with modern CPUs and their caches, the cost of running a small calculation can be much lower than that of accessing [data] that suffers a cache miss. This suggests that sequential probing (or, more generally, probing functions that produce locations physically close to the collision location) can beat a chaining strategy thanks to a lower ratio of cache misses. In this light, purely sequential probing is the best probing function, because its calculation is very simple but, more importantly, because it maximizes the odds of cache hits. With this in mind, when the hash function is well distributed and the load factor is small (hence a short, local search path after an initial collision), a linear (or very local) probing approach should be used; one should avoid probing functions whose search path is not physically local.
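
The locality argument can be illustrated by counting how many cache lines a probe sequence touches. This is a simplified sketch: the names are hypothetical, the quadratic formula is assumed to be `(home + i*i) mod m`, and the model assumes an `int[]` table that starts on a cache-line boundary, which the JVM does not guarantee.

```java
// Illustrative sketch only; every name here is hypothetical.
final class ProbeLocalitySketch {
    // Linear probing: step i inspects slot (home + i) mod m.
    static int[] linearSequence(int home, int m, int steps) {
        int[] seq = new int[steps];
        for (int i = 0; i < steps; i++) seq[i] = (home + i) % m;
        return seq;
    }

    // Quadratic probing (assumed variant): step i inspects (home + i*i) mod m.
    static int[] quadraticSequence(int home, int m, int steps) {
        int[] seq = new int[steps];
        for (int i = 0; i < steps; i++) seq[i] = (int) ((home + (long) i * i) % m);
        return seq;
    }

    // Counts distinct cache lines touched, modelling a line as a block of
    // intsPerLine consecutive slots (16 for 64-byte lines and 4-byte ints).
    static int cacheLinesTouched(int[] seq, int intsPerLine) {
        int count = 0;
        for (int i = 0; i < seq.length; i++) {
            int line = seq[i] / intsPerLine;
            boolean seen = false;
            for (int j = 0; j < i && !seen; j++) {
                seen = (seq[j] / intsPerLine == line);
            }
            if (!seen) count++;
        }
        return count;
    }
}
```

For a table of 1024 slots, home slot 0, and 8 probe steps, the linear sequence stays within 1 modelled cache line, while the quadratic sequence (0, 1, 4, 9, 16, 25, 36, 49) spreads over 4.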


It is difficult to comment specifically on the experiment mentioned in the question without knowing, for example, the size of the hash (if this size matches the word/register size of the CPU, the arithmetic may be faster) or the collision ratio (we assume a good, well-distributed hash function). To take the experiment further, it would be interesting to gather separate timings/statistics for items that reach their hash slot right away versus items that generate collisions.
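
One possible way to split the measurements is to record, for each lookup, how many slots were inspected, and to tally direct hits separately from lookups that collided. The sketch below assumes a quadratic-probing table with hypothetical names, `key mod m` as the hash, and `(home + i*i) mod m` as the probe formula; it counts probes rather than time.

```java
// Illustrative sketch only; every name here is hypothetical.
final class ProbeStatsSketch {
    // Assumed marker for a free slot; this key value itself cannot be stored.
    static final int EMPTY = Integer.MIN_VALUE;

    // Returns how many slots were inspected to find the key
    // (1 means it sat in its home slot), or -1 if the key is absent.
    static int probesToFind(int[] slots, int key) {
        int m = slots.length;
        int home = Math.floorMod(key, m);
        for (int i = 0; i < m; i++) {
            int idx = (int) ((home + (long) i * i) % m);
            if (slots[idx] == EMPTY) return -1;
            if (slots[idx] == key) return i + 1;
        }
        return -1;
    }

    // Returns {direct hits, collided hits, total probes spent on collided hits}.
    static long[] tally(int[] slots, int[] keys) {
        long direct = 0, collided = 0, collidedProbes = 0;
        for (int key : keys) {
            int p = probesToFind(slots, key);
            if (p == 1) {
                direct++;
            } else if (p > 1) {
                collided++;
                collidedProbes += p;
            }
        }
        return new long[] { direct, collided, collidedProbes };
    }
}
```

The same split could be applied to wall-clock timings, for instance by wrapping each group of lookups in `System.nanoTime()` calls, though micro-benchmarks on the JVM need warm-up and many repetitions before the numbers mean much.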

The 'even' in "... even for small load factors ..." indicates that you expected the relative advantage of chaining to grow with load, that is, as collisions become more numerous. I also expect this to be the case. In addition, a higher load may expose yet another drawback of probing: cases where the probe sequence cycles and, more generally, cases where items do not fit.

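The "items that do not fit" problem can be seen by counting how many distinct slots a probe sequence can ever reach. This sketch assumes the plain `(home + i*i) mod m` variant of quadratic probing, which may not be the one used in the question; the class and method names are hypothetical.

```java
// Illustrative sketch only; every name here is hypothetical.
final class ProbeCoverageSketch {
    // Counts the distinct slots reachable from `home` with (home + i*i) mod m.
    // Checking i = 0..m-1 is enough because i*i mod m repeats with period m.
    static int distinctSlotsVisited(int home, int m) {
        boolean[] seen = new boolean[m];
        int count = 0;
        for (int i = 0; i < m; i++) {
            int idx = (int) ((home + (long) i * i) % m);
            if (!seen[idx]) {
                seen[idx] = true;
                count++;
            }
        }
        return count;
    }
}
```

With 11 slots the sequence reaches only 6 of them, and with 8 slots only 3, so an insert can fail even though free slots remain. Other variants (for example triangular-number steps with a power-of-two table size) avoid this, which is one reason the exact formula matters when comparing results.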
