Class Histogram
- Direct Known Subclasses:
SummingHistogram
Histogram: an array of non-negative integer numbers b[v], 0≤v<M,
where every element b[v] represents the number of occurrence of the value v
in some source array A, consisting of integer elements in 0..M−1 range.
The integer values v in the source array are 31-bit:
0≤M<231 (int Java type). The elements (bars) of the histogram
b[v] and also the total number of elements in the source array N are 63-bit:
0≤N<263 (long Java type).
The source array A is always supposed to be sorted in increasing order:
This class is designed for solving 2 basic tasks:
- to find the value v=A[r], if we know its index r in the sorted array A: this value is called the percentile #r (well-known special case is the median, when r=N/2);
- to find the index r in the sorted array A, if we know the corresponding value v=A[r]: this value is called the rank of the value v.
It's important that this class does not store and does not try to sort the source array A, it stores only the histogram b and solves both tasks on the base of it.
According the simple definition above, the rank and the percentile are integer numbers. However, this class works not with integer, but with real precision, using some generalization of the integer rank and percentile to the floating-point case. Below is the formal definition of the real rank and percentile.
Definition of floating-point percentile v(r) and rank r(v) Let b[0..M−1] be an array of non-negative integer numbers, called the histogram (and stored by this class), and let N be the sum of all these elements (the length of the supposed, but not stored source sorted array A). In addition, we suppose that there is an additional "virtual" element b[M], which is always zero. The elements b[k] are called the bars of the histogram.
This class operates with the function v(r), called the percentile, and its inverse function v(r), called the rank. Both variables r and v are real (floating-point) numbers in the following ranges:
0.0≤r≤N ,0.0≤v≤M . The definition of both functions depends on so called histogram model. This class supports two following models.
- Simple histogram model
In this case, r(v) is a simple polyline generalization of the integer function: while v is increasing from some integer v0 to the next integer
v0+1 (some bar of the histogram), the rankr(v) is either the constant ifb[v0]=0 , or uniformly increased from some r0 tor0+b[v0] ifb[v0]>0 . More precisely:
- v(r) = v0 + (r−r0)/b[v0], where v0 is the minimal integer value so that b[0]+b[1]+...+b[v0]>⌊r⌋ (here and below ⌊x⌋ means the integer part of x or (long)x for our non-negative numbers), and r0 = this sum−b[v0] =
b[0]+b[1]+...+b[v0−1] . In the only special case r=N this formula is not correct, because the sum of histogram bars b[k] cannot be greater than the sum of all bars N, and we consider, by definition, thatv(N) = limx→N+1v(x) = ⌊v(N−1)⌋+1 .
Note: according this definition,v(0) = min (k∈Z: b[k]>0) andv(N) = ⌊v(N−1)⌋+1 = max (k∈Z: b[k]>0)+1 .
In a case N=0 (empty histogram), v(r) functions is considered to be undefined.
- r(v) = r0 + (v−v0)*b[v0], where v0=⌊v⌋ and r0 = b[0]+b[1]+...+b[v0−1].
Note: according this definition, r(v)=0 when v<v(0) and r(v)=N when v>v(N). In particular,r(0)=0 andr(M)=N (remember thatb[M]=0 by definition).
Unlike v(r), r(v) function is defined also if N=0: in this case it is the zero constant.It's easy to see that the r(v) function is continuous, but, unfortunately, the v(r) function is discontinuous if some bars b[k] are zero. This behaviour is not too good for calculating percentiles on sparse histograms, when a lot of bars are zero. In this case, this generalization to floating-point almost does not improve the precision of the calculated percentile.
To better understand the sense of this model, please also read the comments to
value(long[], double)
method.- Precise histogram model
Here v(r) and r(v) are also polyline functions, and they are identical to these function of the simple model in all points where the rank is integer (r∈Z). However, if the rank is not integer, we consider, by definition, that v(r) is uniformly increased from v(r0) to
v(r0+1) , wherer0=⌊r⌋ . In other words, we interpolate the percentile v(r0) between integer ranks. More precisely:
- v(r) =
As in the simple model, in the special case r=N we consider, by definition, that
- v0 + (r−r0)/b, where v0 is the minimal integer value so that
b[0]+b[1]+...+b[v0]>⌊r⌋ ,r0 = b[0]+b[1]+...+b[v0−1] ,b = b[v0] — if b>1 andr0 ≤ r ≤ r0+b−1 = r0' ;- v0' + (r−r0')*(v1−v0'), where
v0' = v0+(b−1)/b = v(r0') andv1 = min (k∈Z: k>v0 & b[k]>0) = v(r0'+1) is the next non-zero histogram bar — ifr0'≤r≤r0'+1=r1 . In a case r>N−1, there is no the next non-zero bar and we consider, by definition, thatv1 = v0+1 .v(N) = limx→N+1v(x) = ⌊v(N−1)⌋+1 .
As in the simple model, herev(0) = min (k∈Z: b[k]>0) andv(N) = ⌊v(N−1)⌋+1 = max (k∈Z: b[k]>0)+1 .
In a case N=0 (empty histogram), v(r) functions is considered to be undefined.
- r(v) is defined as the inverse function for v(r) if v(0)≤v≤v(N); outside this range, we consider r(v)=0 when v<v(0) and r(v)=N when v>v(N). As in the simple model, in the special case N=0 we consider, by definition, that r(v)=0 for any v.
In this model both functions v(r) and r(v) are increasing and continuous. But calculations are more complicated. The difference appears if there are empty bars (
b[k]=0 ); namely, in each non-empty barb[v0]=b , followed by empty bars, there is new salient point of the polyline r(v):v = v0' = v0+(b−1)/b , and after it the rank r(v) is increasing until the next rankr(v0)+b during all zero bars following afterb[v0] . This feature does not appear at the right end of the histogram, if all bars following afterb[v0] are zero.Note: the traditional definition of the median of the source array A with even length N=2n is
(A[n−1]+A[n])/2 (if we supposeA[0]≤A[1]≤...≤A[N−1] ). This definition is identical to the percentilev(n−0.5) in the precise histogram model, if all bars contains 0 or 1 elements (allb[k]≤1 ).To better understand the sense of this model, please also read the comments to
preciseValue(long[], double)
method.
This class is optimized for the case, when we already know some corresponding pair r (rank)
and v (percentile), and we need to slightly change the situation: add or remove several A elements,
then, maybe, little increase or decrease the known rank r or the value v and, after this,
quickly find (recalculate) the second element of the pair: correspondingly the percentile v or the
rank r. To do this, this class provides methods include
and
exlcude
, allowing to increment or decrement elements of b array
(this action corresponds to adding or removing elements to/from the source array A),
and supports the concept of the current rank and current value (percentile)
with a necessary set of methods for setting and getting them.
Usually all these method work quickly (O(1) operations), so every time when you correct the histogram
(add/remove elements) or change the current rank or the value, you can immediately read the second element
of the pair (correspondingly the percentile or the rank).
More precisely, in every object of this class, at any moment, there are:
- the current value v: a real number in range 0≤v≤M,
that can be set by
moveToValue(double)
method or got bycurrentValue()
method; - the current simple rank rS: a real number in range 0≤v≤N,
that can be set by
moveToRank(double)
method or got bycurrentRank()
method; - the current precise rank rP: a real number in range 0≤v≤N,
that can be set by
moveToPreciseRank(double)
method or got bycurrentPreciseRank()
method.
The methods include
and exlcude
do not change the current value,
but can change the current simple and precise ranks.
This class guarantees that for any v we have
This class guarantees that for any rS
we have moveToValue(double)
call, the current value is equal
to the argument of this method.
If such situation takes place after moveToRank(double)
call, the current value is equal
to the moveToRank(0)
the current value v
will be equal to moveToRank(N)
the current value v
will be equal to moveToRank(double)
method.
This class guarantees that for any rP
we have moveToValue(double)
method; but after the call
moveToPreciseRank(0)
the current value v
will be equal to moveToValue(double)
method; but after the call
moveToPreciseRank(N)
the current value v
will be equal to moveToPreciseRank(double)
method.
There are additional methods for setting integer values and ranks:
moveToIValue(int)
and moveToIRank(long)
. They are strictly equivalent
to floating-point methods moveToValue(double)
and moveToRank(double)
,
called with the same argument, but work little faster. There is no special integer version
of moveToPreciseRank(double)
method, because for integer ranks the simple and the precise
histogram models are identical.
There are additional methods for getting integer values and ranks:
currentIValue()
and currentIRank()
.
The currentIValue()
method returns the current value, rounded to an integer.
The rules of rounding are complicated enough and described in the comments to this method;
depending on the situation, it can work as rounding to the nearest integer (Math.round)
or as truncating (Math.floor). In any case, there is a guarantee that
currentIValue()
−0.5≤v<currentIValue()
+1currentIRank()
method return the rank, corresponding to w=currentIValue()
:
it is equal to
You can create an instance of this class by the following methods:
newLongHistogram(int histogramLength, int...bitLevelsOfPyramid)
;newLongHistogram(long[] histogram, int...bitLevelsOfPyramid)
;newIntHistogram(int histogramLength, int...bitLevelsOfPyramid)
;newIntHistogram(int[] histogram, int...bitLevelsOfPyramid)
.
After creation, the only way to change the bars b[k] is using
include(int)
, exclude(int)
, include(int...)
and exclude(int...)
methods.
After creation, the current value v and both current ranks
Sometimes you need calculating (and supporting) not one, but several pairs "current value + current rank"
in the same histogram. If is possible to use a single object of this class and move from one pair to another,
but if the necessary values are not close to each other, the moveTo... methods work relatively
slowly. In this case, you can share the histogram b[k] between several instance of this
class by share()
method. The sharing instances work with the same b[k] array:
any modification by include
/ exclude
methods are immediately reflected
in all shared instances. But each sharing instance has an independent set of current value, current simple rank
and current precise rank. You can get all sharing instances by nextSharing()
method.
This class also provides static methods for calculating value(long[] histogram, double rank)
, value(int[] histogram, double rank)
for the simple histogram model,
preciseValue(long[] histogram, double rank)
, preciseValue(int[] histogram, double rank)
for the precise histogram model.
These methods can be useful if you need to process the given histogram only once.
There is no guarantee that the same results, got by different ways (for example, by static methods and by creating an instance of this class and using its methods) are absolutely identical: little mismatches in the last digits after the decimal point are possible.
This class does not implement own equals and hashCode methods. So, this class does not provide a mechanism for comparing different histograms.
This class is not thread-safe, but is thread-compatible and can be synchronized manually, if multithread access is necessary.
- Author:
- Daniel Alievsky
-
Method Summary
Modifier and TypeMethodDescriptionabstract long
bar
(int value) Returns the bar #value of the histogram: b[value].abstract long[]
bars()
Returns all bars of the histogram.static void
clear
(int[] histogram, int sumOfColumns) Fills all non-zero elements of histogram by 0.abstract long
Returns the rankr(w) of the value w=currentIValue()
.final int
Returns thecurrent value
v, rounded to an integer number.final double
Returns the current precise rank:rP=r(v) in terms of the precise histogram model.final double
Returns the current simple rank:rS=r(v) in terms of the simple histogram model.final double
Returns the current value v.abstract void
exclude
(int value) Decrements the bar #value of the histogram by 1: b[value]--.abstract void
exclude
(int... values) Equivalent to a simple loop of calls ofexclude(int)
method for all passed values.abstract void
include
(int value) Increments the bar #value of the histogram by 1: b[value]++.abstract void
include
(int... values) Equivalent to a simple loop of calls ofinclude(int)
method for all passed values.static int
iPreciseValue
(int[] histogram, double rank) Precise equivalent ofiPreciseValue(long[], double)
for a case of int[] type of the histogram.static int
iPreciseValue
(long[] histogram, double rank) "Interpolated" version ofiValue(long[], long)
, rounded to the "best" integer result.static int
iValue
(int[] histogram, long rank) Precise equivalent ofiValue(long[], long)
for a case of int[] type of the histogram.static int
iValue
(long[] histogram, long rank) Returns the element with the given index rank in the sorted array of integer numbers 0..histogram.length-1, corresponding to this histogram.final boolean
Returns true if and only if the current bar is zero: b[⌊v⌋]=0 (v is thecurrent value
) and all bars from the left are also zero: b[k]=0 for all k<⌊v⌋.final boolean
Returns true if and only all bars from the left of the current bar are zero: b[k]=0 for all k<⌊v⌋ (v is thecurrent value
).final int
length()
Returns the number M of bars of the histogram.abstract Histogram
moveToIRank
(long rank) Sets the current simple rank rS and precise rank rP to be equal of the rank argument.abstract Histogram
moveToIValue
(int value) Sets the current value v to be equal of the value argument.moveToPreciseRank
(double rank) Sets the current precise rank rP to be equal of the rank argument.abstract Histogram
moveToRank
(double rank) Sets the current simple rank rS to be equal of the rank argument.moveToValue
(double value) Sets the current value v to be equal of the value argument.static Histogram
newIntHistogram
(int[] histogram, int... bitLevelsOfPyramid) Creates new 32-bit histogram, consisting of M=histogram.length bars, equal to elements of the given array.static Histogram
newIntHistogram
(int histogramLength, int... bitLevelsOfPyramid) Creates new 32-bit histogram, consisting of M=histogramLength empty bars.static Histogram
newLongHistogram
(int histogramLength, int... bitLevelsOfPyramid) Creates new histogram, consisting of M=histogramLength empty bars.static Histogram
newLongHistogram
(long[] histogram, int... bitLevelsOfPyramid) Creates new histogram, consisting of M=histogram.length bars, equal to elements of the given array.abstract Histogram
Returns the next instance of this class, sharing the histogram array b[k] with this instance.final boolean
Returns true if and only if the current bar is zero: b[⌊v⌋]=0 (v is thecurrent value
) and either all bars from the left, or all bars from the right are also zero.static double
percentile
(int[] histogram, long sumOfColumns, double percentileLevel) Equivalent topreciseValue
(histogram,percentileLevel*(sumOfColumns-1)), if sumOfColumns>0.static double
percentile
(long[] histogram, long sumOfColumns, double percentileLevel) Equivalent topreciseValue
(histogram,percentileLevel*(sumOfColumns-1)), if sumOfColumns>0.static double
preciseValue
(int[] histogram, double rank) Precise equivalent ofpreciseValue(long[], double)
for a case of int[] type of the histogram.static double
preciseValue
(long[] histogram, double rank) "Interpolated" version ofiValue(long[], long)
.final boolean
Returns true if and only if the current bar is zero: b[⌊v⌋]=0 (v is thecurrent value
) and all bars from the right are also zero: b[k]=0 for all k>⌊v⌋.final boolean
Returns true if and only all bars from the right of the current bar are zero: b[k]=0 for all k>⌊v⌋ (v is thecurrent value
).abstract Histogram
share()
Creates new instance of this class, which uses the same arrays of bars b[k].abstract long
Returns the number of instances of this class, sharing the histogram array b[k] with this instance.static int
sumOf
(int[] histogram) Returns the sum of all elements of the passed histogram array.static long
sumOf
(long[] histogram) Returns the sum of all elements of the passed histogram array.abstract long
total()
Returns the sum N of all bars b[k] of the histogram.static double
value
(int[] histogram, double rank) Precise equivalent ofvalue(long[], double)
for a case of int[] type of the histogram.static double
value
(long[] histogram, double rank) Floating-point version ofiValue(long[], long)
.
-
Method Details
-
newLongHistogram
Creates new histogram, consisting of M=histogramLength empty bars. In other words, all bars b[k]=0 at first; but they can be increased byinclude(int)
method.The bitLevelsOfPyramid argument is used for optimization of large histograms, consisting of thousands or millions bars. Namely, this class automatically builds and supports a pyramid of histograms: m=bitLevelsOfPyramid.length additional arrays
b1[k], b2[k], ..., bm[k] , wherebq[k] = Σ 2s*k ≤ j < min(2s*(k+1), M) b[j], s=bitLevelsOfPyramid[q−1];
bq.length = ⌊M/2s⌋ = histogramLength >> bitLevelsOfPyramid[q−1].In other words, every "sub-histogram" bq consists of "wide" bars, where the width of bars is
2s=bitLevelsOfPyramid[q−1] : it is a sum of2s bars b[j] of the base histogram, excepting the last "wide" bar which can be a sum of less number of bars b[j]. The elements of bitLevelsOfPyramid array must be in 1..31 range and must be listed in increasing order:0<bitLevelsOfPyramid[0]<...<bitLevelsOfPyramid[m−1]≤31 .Supporting this pyramid little slows down
include(int)
andexclude(int)
methods: they require O(m) operations to correct m arrays bq. But it can essentially optimize all methods which set the current value and rank: in the worst case they requireO (bm.length + Σ q=1,2,...,m−12bitLevelsOfPyramid[q−1])
and sequential settings of the current value and rank work much faster if the current value changes slightly.
Without the pyramid of histograms (m=bitLevelsOfPyramid.length=0), the required time of setting the current value or rank is O(M) in the worst case, and sequential settings of the current value and rank also work much faster if the current value changes slightly. If the histogram length M is not large, for example, 256 or less, it is possible that this class will work faster without bitLevelsOfPyramid arguments.
The passed bitLevelsOfPyramid argument is cloned by this method: no references to it are maintained by the created object.
If you are sure that the sum of all bars b[k] (the total length of the supposed source array A) will never exceed Integer.MAX_VALUE, you can use
newIntHistogram(int, int...)
method instead of this one: the created object will probably work little faster and will occupy less memory.- Parameters:
histogramLength
- the number M of bars of the new histogram.bitLevelsOfPyramid
- the bit levels: binary logarithms of widths of bars in the sub-histograms in the "histogram pyramid"; can be empty, then will be ignored (the histogram pyramid will not be used).- Returns:
- the new histogram with zero (empty) bars b[k]=0.
- Throws:
NullPointerException
- if bitLevelsOfPyramid argument is null.IllegalArgumentException
- if histogramLength<0, or if bitLevelsOfPyramid.length>30, or if some of elements bitLevelsOfPyramid is not in 1..31 range, or ifbitLevelsOfPyramid[k] >= bitLevelsOfPyramid[k+1] for some k.
-
newLongHistogram
Creates new histogram, consisting of M=histogram.length bars, equal to elements of the given array. In other words, the barsb[k]=histogram[k] at first.The bitLevelsOfPyramid argument is used for optimization of large histograms, consisting of thousands or millions bars. Namely, this class automatically builds and supports a pyramid of histograms: m=bitLevelsOfPyramid.length additional arrays
b1[k], b2[k], ..., bm[k] , wherebq[k] = Σ 2s*k ≤ j < min(2s*(k+1), M) b[j], s=bitLevelsOfPyramid[q−1];
bq.length = ⌊M/2s⌋ = histogram.length >> bitLevelsOfPyramid[q−1].In other words, every "sub-histogram" bq consists of "wide" bars, where the width of bars is
2s=bitLevelsOfPyramid[q−1] : it is a sum of2s bars b[j] of the base histogram, excepting the last "wide" bar which can be a sum of less number of bars b[j]. The elements of bitLevelsOfPyramid array must be in 1..31 range and must be listed in increasing order:0<bitLevelsOfPyramid[0]<...<bitLevelsOfPyramid[m−1]≤31 .Supporting this pyramid little slows down
include(int)
andexclude(int)
methods: they require O(m) operations to correct m arrays bq. But it can essentially optimize all methods which set the current value and rank: in the worst case they requireO (bm.length + Σ q=1,2,...,m−12bitLevelsOfPyramid[q−1])
and sequential settings of the current value and rank work much faster if the current value changes slightly.
Without the pyramid of histograms (m=bitLevelsOfPyramid.length=0), the required time of setting the current value or rank is O(M) in the worst case, and sequential settings of the current value and rank also work much faster if the current value changes slightly. If the histogram length M is not large, for example, 256 or less, it is possible that this class will work faster without bitLevelsOfPyramid arguments.
The passed histogram and bitLevelsOfPyramid arguments are cloned by this method: no references to them are maintained by the created object.
If you are sure that the sum of all bars b[k] (the total length of the supposed source array A) will never exceed Integer.MAX_VALUE, you can use
newIntHistogram(int[], int...)
method instead of this one: the created object will probably work little faster and will occupy less memory.- Parameters:
histogram
- initial values of the bars b[k] of the histogram.bitLevelsOfPyramid
- the bit levels: binary logarithms of widths of bars in the sub-histograms in the "histogram pyramid"; can be empty, then will be ignored (the histogram pyramid will not be used).- Returns:
- the new histogram with bars b[k]=histogram[k].
- Throws:
NullPointerException
- if histogram or bitLevelsOfPyramid argument is null.IllegalArgumentException
- if some of histogram elements are negative (<0), or if sum of all bars (elements of histogram array) is greater than Long.MAX_VALUE, or if bitLevelsOfPyramid.length>30, or if some of elements bitLevelsOfPyramid is not in 1..31 range, or ifbitLevelsOfPyramid[k] >= bitLevelsOfPyramid[k+1] for some k.
-
newIntHistogram
Creates new 32-bit histogram, consisting of M=histogramLength empty bars. In other words, all bars b[k]=0 at first; but they can be increased byinclude(int)
method. "32-bit" means that the sum of all bars b[k] (the total length of the supposed source array A) will not be able to exceed Integer.MAX_VALUE. If you need to process greater numbers, please usenewLongHistogram(int, int...)
method instead of this one.The bitLevelsOfPyramid argument is used for optimization of large histograms, consisting of thousands or millions bars. Namely, this class automatically builds and supports a pyramid of histograms: m=bitLevelsOfPyramid.length additional arrays
b1[k], b2[k], ..., bm[k] , wherebq[k] = Σ 2s*k ≤ j < min(2s*(k+1), M) b[j], s=bitLevelsOfPyramid[q−1];
bq.length = ⌊M/2s⌋ = histogramLength >> bitLevelsOfPyramid[q−1].In other words, every "sub-histogram" bq consists of "wide" bars, where the width of bars is
2s=bitLevelsOfPyramid[q−1] : it is a sum of2s bars b[j] of the base histogram, excepting the last "wide" bar which can be a sum of less number of bars b[j]. The elements of bitLevelsOfPyramid array must be in 1..31 range and must be listed in increasing order:0<bitLevelsOfPyramid[0]<...<bitLevelsOfPyramid[m−1]≤31 .Supporting this pyramid little slows down
include(int)
andexclude(int)
methods: they require O(m) operations to correct m arrays bq. But it can essentially optimize all methods which set the current value and rank: in the worst case they requireO (bm.length + Σ q=1,2,...,m−12bitLevelsOfPyramid[q−1])
and sequential settings of the current value and rank work much faster if the current value changes slightly.
Without the pyramid of histograms (m=bitLevelsOfPyramid.length=0), the required time of setting the current value or rank is O(M) in the worst case, and sequential settings of the current value and rank also work much faster if the current value changes slightly. If the histogram length M is not large, for example, 256 or less, it is possible that this class will work faster without bitLevelsOfPyramid arguments.
The passed bitLevelsOfPyramid argument is cloned by this method: no references to it are maintained by the created object.
- Parameters:
histogramLength
- the number M of bars of the new histogram.bitLevelsOfPyramid
- the bit levels: binary logarithms of widths of bars in the sub-histograms in the "histogram pyramid"; can be empty, then will be ignored (the histogram pyramid will not be used).- Returns:
- the new histogram with zero (empty) bars b[k]=0.
- Throws:
NullPointerException
- if bitLevelsOfPyramid argument is null.IllegalArgumentException
- if histogramLength<0, or if bitLevelsOfPyramid.length>30, or if some of elements bitLevelsOfPyramid is not in 1..31 range, or ifbitLevelsOfPyramid[k] >= bitLevelsOfPyramid[k+1] for some k.
-
newIntHistogram
Creates new 32-bit histogram, consisting of M=histogram.length bars, equal to elements of the given array. In other words, the barsb[k]=histogram[k] at first. "32-bit" means that the sum of all bars b[k] (the total length of the supposed source array A) cannot exceed and will not be able to exceed Integer.MAX_VALUE. If you need to process greater numbers, please usenewLongHistogram(long[], int...)
method instead of this one.The bitLevelsOfPyramid argument is used for optimization of large histograms, consisting of thousands or millions bars. Namely, this class automatically builds and supports a pyramid of histograms: m=bitLevelsOfPyramid.length additional arrays
b1[k], b2[k], ..., bm[k] , wherebq[k] = Σ 2s*k ≤ j < min(2s*(k+1), M) b[j], s=bitLevelsOfPyramid[q−1];
bq.length = ⌊M/2s⌋ = histogram.length >> bitLevelsOfPyramid[q−1].In other words, every "sub-histogram" bq consists of "wide" bars, where the width of bars is
2s=bitLevelsOfPyramid[q−1] : it is a sum of2s bars b[j] of the base histogram, excepting the last "wide" bar which can be a sum of less number of bars b[j]. The elements of bitLevelsOfPyramid array must be in 1..31 range and must be listed in increasing order:0<bitLevelsOfPyramid[0]<...<bitLevelsOfPyramid[m−1]≤31 .Supporting this pyramid little slows down
include(int)
andexclude(int)
methods: they require O(m) operations to correct m arrays bq. But it can essentially optimize all methods which set the current value and rank: in the worst case they requireO (bm.length + Σ q=1,2,...,m−12bitLevelsOfPyramid[q−1])
and sequential settings of the current value and rank work much faster if the current value changes slightly.
Without the pyramid of histograms (m=bitLevelsOfPyramid.length=0), the required time of setting the current value or rank is O(M) in the worst case, and sequential settings of the current value and rank also work much faster if the current value changes slightly. If the histogram length M is not large, for example, 256 or less, it is possible that this class will work faster without bitLevelsOfPyramid arguments.
The passed histogram and bitLevelsOfPyramid arguments are cloned by this method: no references to them are maintained by the created object.
- Parameters:
histogram
- initial values of the bars b[k] of the histogram.bitLevelsOfPyramid
- the bit levels: binary logarithms of widths of bars in the sub-histograms in the "histogram pyramid"; can be empty, then will be ignored (the histogram pyramid will not be used).- Returns:
- the new histogram with bars b[k]=histogram[k].
- Throws:
NullPointerException
- if histogram or bitLevelsOfPyramid argument is null.IllegalArgumentException
- if some of histogram elements are negative (<0), or if sum of all bars (elements of histogram array) is greater than Integer.MAX_VALUE, or if bitLevelsOfPyramid.length>30, or if some of elements bitLevelsOfPyramid is not in 1..31 range, or ifbitLevelsOfPyramid[k] >= bitLevelsOfPyramid[k+1] for some k.
-
length
public final int length()Returns the number M of bars of the histogram.The result of this method is always non-negative (≥0).
- Returns:
- the number M of bars of the histogram.
-
total
public abstract long total()Returns the sum N of all bars b[k] of the histogram. In other words, it is the current length of the supposed source array A, on the base of which this histogram is built.The result of this method is always non-negative (≥0).
- Returns:
- the sum N of all bars b[k] of the histogram.
-
bar
public abstract long bar(int value) Returns the bar #value of the histogram: b[value]. If the index value is negative or ≥M=length()
, this method returns 0 and does not throw an exception (unlikeinclude
andexclude
methods).The result of this method is always non-negative (≥0).
- Parameters:
value
- the index of the bar; can be out of 0..M range.- Returns:
- the bar #value of the histogram.
-
bars
public abstract long[] bars()Returns all bars of the histogram. The length of the returned array is M=length()
, and the element #k is equal to b[k].The returned array is never a reference to an internal array stored in this object: if necessary, the internal Java array is cloned.
- Returns:
- all bars of the histogram as long[] Java array.
-
include
public abstract void include(int value) Increments the bar #value of the histogram by 1: b[value]++. It can be interpreted as adding one element value to the source array A, on the base of which this histogram is built.If this histogram is 32-bit, that is created by
newIntHistogram(int, int...)
ornewIntHistogram(int[], int...)
method, this method throws IllegalStateException if the current sum of all barstotal()
is Integer.MAX_VALUE (231−1). In other cases, this method throws IllegalStateException if the current sum of all bars is Long.MAX_VALUE (263−1). In a case of throwing an exception, this method does not change the histogram.This method does not change the current value, but can change the current simple and precise ranks. If there are some
sharing instances
, this method change ranks in all sharing instances.- Parameters:
value
- the index of the increased histogram bar.- Throws:
IndexOutOfBoundsException
- if value<0 or value>=M=length()
.IllegalStateException
- iftotal()
==Long.MAX_VALUE (or Integer.MAX_VALUE for 32-bit histogram).- See Also:
-
exclude
public abstract void exclude(int value) Decrements the bar #value of the histogram by 1: b[value]--. It can be interpreted as removing one element, equal to value, from the source array A, on the base of which this histogram is built.If the bar #value of the histogram is zero (b[value]=0), this method throws IllegalStateException and does not change the histogram. So, the bars of the histogram cannot become negative.
This method does not change the current value, but can change the current simple and precise ranks. If there are some
sharing instances
, this method change ranks in all sharing instances.- Parameters:
value
- the index of the increased histogram bar.- Throws:
IndexOutOfBoundsException
- if value<0 or value>=M=length()
.IllegalStateException
- if b[value]=0.- See Also:
-
include
public abstract void include(int... values) Equivalent to a simple loop of calls ofinclude(int)
method for all passed values.The only possible difference from this loop is that the exception, if it occur, can be thrown at the very beginning (but can also be thrown only while increasing the corresponding bar). As well as the simple loop of
include(int)
, this method can be non-atomic regarding this failure: it can increase some histogram bars and then throw an exception.- Parameters:
values
- the indexes of the increased histogram bars. If some index is repeated several times in this array, the corresponding histogram bar will be increased several times.- Throws:
NullPointerException
- if values array is null.IndexOutOfBoundsException
- if some values[k]<0 or values[k]>=M=length()
.IllegalStateException
- iftotal()
>Long.MAX_VALUE-values.length (or Integer.MAX_VALUE-values.length for 32-bit histogram).- See Also:
-
exclude
public abstract void exclude(int... values) Equivalent to a simple loop of calls ofexclude(int)
method for all passed values.The only possible difference from this loop is that the exception, if it occur, can be thrown at the very beginning (but can also be thrown only while decreasing the corresponding bar). As well as the simple loop of
exclude(int)
, this method can be non-atomic regarding this failure: it can decrease some histogram bars and then throw an exception.- Parameters:
values
- the indexes of the decreased histogram bars. If some index is repeated several times in this array, the corresponding histogram bar will be decreased several times.- Throws:
NullPointerException
- if values array is null.IndexOutOfBoundsException
- if some values[k]<0 or values[k]>=M=length()
.IllegalStateException
- if b[values[k]]=0 for some k.- See Also:
-
currentIRank
public abstract long currentIRank()Returns the rankr(w) of the value w=currentIValue()
. (This rank is the same in both histogram models.) In other words, it is the sum of all histogram bars b[k] from the left of the bar #w:Σk<w b[k] . Seecomments to Histogram class
for more details.- Returns:
- the rank of the integer part of the current value.
- See Also:
-
currentRank
public final double currentRank()Returns the current simple rank:rS=r(v) in terms of the simple histogram model. Seecomments to Histogram class
for more details.- Returns:
- the current simple rank.
- See Also:
-
currentPreciseRank
public final double currentPreciseRank()Returns the current precise rank:rP=r(v) in terms of the precise histogram model. Seecomments to Histogram class
for more details.- Returns:
- the current precise rank.
- See Also:
-
currentIValue
public final int currentIValue()Returns thecurrent value
v, rounded to an integer number.The rules of rounding depend on the order of previous calls of 5 methods, changing the current value and ranks:
moveToValue(double)
,moveToRank(double)
,moveToPreciseRank(double)
,moveToIValue(int)
,moveToIRank(long)
. (Calls of any other methods, in particular,include
andexclude
, do not affect to the result of this method.)If the last from those methods was
moveToValue
,moveToRank
,moveToIValue
ormoveToIRank
, then the result of this method is just the integer part of the current value: ⌊v⌋. But aftermoveToPreciseRank
method the returned value will be equal toiPreciseValue(histogram, rank)
, where histogram is this histogram (the result ofbars()
method) and rank is the argument ofmoveToPreciseRank(double)
.The special case
N= is an exception from the last rule: in this case, all 3 methodstotal()
=0moveToPreciseRank
,moveToRank
andmoveToIRank
are equivalent, but the concrete results ofcurrentValue()
method (i.e. v) and this method are not documented — unlike the result ofiPreciseValue
static function, which returns 0 in this case. However, in this case, as after any call ofmoveToRank
/moveToIRank
, there is a guarantee that the result of this method is equal to the integer part of the current value ⌊v⌋.Immediately after creating a new histogram this method always returns 0 (like
currentValue()
).This result of this method always lies between (int)v=⌊v⌋ and
Math.round(v) .- Returns:
- the current value (percentile), rounded to an integer number.
- See Also:
-
currentValue
public final double currentValue()Returns the current value v. It is equal to the percentile of this histogram, found either in terms of the simple histogram model, if before this we calledmoveToRank(double)
method, or in terms of the precise histogram model, if before this we calledmoveToPreciseRank(double)
method. Seecomments to Histogram class
for more details.- Returns:
- the current value (percentile).
- See Also:
-
outsideNonZeroPart
public final boolean outsideNonZeroPart()Returns true if and only if the current bar is zero: b[⌊v⌋]=0 (v is thecurrent value
) and either all bars from the left, or all bars from the right are also zero. Equivalent to .leftFromNonZeroPart()
||rightFromNonZeroPart()
- Returns:
- true if the current bar (containing the current value) and all bars rightward or leftward from it are zero.
-
leftFromNonZeroPart
public final boolean leftFromNonZeroPart()Returns true if and only if the current bar is zero: b[⌊v⌋]=0 (v is thecurrent value
) and all bars from the left are also zero: b[k]=0 for all k<⌊v⌋.- Returns:
- true if the current bar (containing the current value) and all bars leftward from it are zero.
-
leftFromOrAtBoundOfNonZeroPart
public final boolean leftFromOrAtBoundOfNonZeroPart()Returns true if and only all bars from the left of the current bar are zero: b[k]=0 for all k<⌊v⌋ (v is thecurrent value
).- Returns:
- true if all bars leftward from the current value are zero.
-
rightFromNonZeroPart
public final boolean rightFromNonZeroPart()Returns true if and only if the current bar is zero: b[⌊v⌋]=0 (v is thecurrent value
) and all bars from the right are also zero: b[k]=0 for all k>⌊v⌋.- Returns:
- true if the current bar (containing the current value) and all bars rightward from it are zero.
-
rightFromOrAtBoundOfNonZeroPart
public final boolean rightFromOrAtBoundOfNonZeroPart()Returns true if and only all bars from the right of the current bar are zero: b[k]=0 for all k>⌊v⌋ (v is thecurrent value
).- Returns:
- true if all bars rightward from the current value are zero.
-
moveToIRank
Sets the current simple rank rS and precise rank rP to be equal of the rank argument. (Because the argument is integer, both rS and rP ranks are the same.) If the rank argument is negative, it is replaced with 0 (minimal possible rank); ifrank>N= , it is replaced with N (maximal possible rank). Thetotal()
current value
v automatically changes in accordance to the new rank. Seecomments to Histogram class
for more details.In the special case N=0 (all bars of the histograms are zero), both simple and precise ranks are always zero, not depending on calls of this method:
rS=rP=0 (becauser(v) is a zero constant by definition). Butv(r) function (unliker(v) ) is not defined in this case, so, if N=0, the current value v after calling this method is not documented — there is the only guarantee that0≤v≤M .This method works little faster than equivalent calls
moveToRank(rank)
andmoveToPreciseRank(rank)
.- Parameters:
rank
- new rank rS=rP.- Returns:
- the reference to this object.
- See Also:
-
moveToRank
Sets the current simple rank rS to be equal of the rank argument. If the rank argument is negative, it is replaced with 0 (minimal possible rank); ifrank>N= , it is replaced with N (maximal possible rank). Thetotal()
current precise rank
rP and thecurrent value
v automatically change in accordance to the new simple rank. Seecomments to Histogram class
for more details.In the special case N=0 (all bars of the histograms are zero), both simple and precise ranks are always zero, not depending on calls of this method:
rS=rP=0 (becauser(v) is a zero constant by definition). Butv(r) function (unliker(v) ) is not defined in this case, so, if N=0, the current value v after calling this method is not documented — there is the only guarantee that0≤v≤M .- Parameters:
rank
- new simple rank rS.- Returns:
- the reference to this object.
- Throws:
IllegalArgumentException
- if Double.isNaN(rank).- See Also:
-
moveToPreciseRank
Sets the current precise rank rP to be equal of the rank argument. If the rank argument is negative, it is replaced with 0 (minimal possible rank); ifrank>N= , it is replaced with N (maximal possible rank). Thetotal()
current simple rank
rS and thecurrent value
v automatically change in accordance to the new precise rank. Seecomments to Histogram class
for more details.In the special case N=0 (all bars of the histograms are zero), both simple and precise ranks are always zero, not depending on calls of this method:
rS=rP=0 (becauser(v) is a zero constant by definition). Butv(r) function (unliker(v) ) is not defined in this case, so, if N=0, the current value v after calling this method is not documented — there is the only guarantee that0≤v≤M .- Parameters:
rank
- new precise rank rP.- Returns:
- the reference to this object.
- Throws:
IllegalArgumentException
- if Double.isNaN(rank).- See Also:
-
moveToIValue
Sets the current value v to be equal of the value argument. If the value argument is negative, it is replaced with 0 (minimal possible value); ifvalue>M= , it is replaced with M (maximal possible value). Thelength()
current simple rank
rS and thecurrent precise rank
rP automatically change in accordance to the new value. Seecomments to Histogram class
for more details.This methods works little faster than the equivalent call
moveToValue(value)
.- Parameters:
value
- new current value (percentile).- Returns:
- the reference to this object.
- See Also:
-
moveToValue
Sets the current value v to be equal of the value argument. If the value argument is negative, it is replaced with 0 (minimal possible value); ifvalue>M= , it is replaced with M (maximal possible value). Thelength()
current simple rank
rS and thecurrent precise rank
rP automatically change in accordance to the new value. Seecomments to Histogram class
for more details.- Parameters:
value
- new current value (percentile).- Returns:
- the reference to this object.
- Throws:
IllegalArgumentException
- if Double.isNaN(value).- See Also:
-
nextSharing
Returns the next instance of this class, sharing the histogram array b[k] with this instance.All instances, created by
share()
method, are connected into a circular list, and this method returns the next element in this list. For example, if the instance h1 was created bynewLongHistogram(int, int...)
method and, after this, the instance h2 was created ash2=h1. , then this method in h1 object returns h2 and in h2 object returns h1. If there are no sharing instances, this method returns the reference to this instance.share()
You can get all instances, sharing the same array b[k] with the given histogram hist, by the following loop:
Histogram h = hist.nextSharing(); do { // some processing h instance h = h.nextSharing(); } while (h != hist);
See
comments to Histogram class
for more details.- Returns:
- the next instance sharing the histogram array b[k] with this instance,
or the reference to this instance if you did not use
share()
method. - See Also:
-
iValue
public static int iValue(long[] histogram, long rank) Returns the element with the given index rank in the sorted array of integer numbers 0..histogram.length-1, corresponding to this histogram.More precisely, returns minimal integer value v so that
b[0]+b[1]+...+b[v]>rank , b[k]=histogram[k]. If rank≤0, this method returnsmin (k∈Z: b[k]>0) (the minimal element in the source array). Ifrank≥(sum of all b[k]) , it returnsmax (k∈Z: b[k]>0)+1 (the maximal element plus 1). If all columnsb[k] are zero (no elements), this method returns 0.- Parameters:
histogram
- histogram[k] is the number of elements in some source array that are equal to k. All histogram[k] must be non-negative; in other case, IllegalArgumentException can be thrown (but also can be not thrown).rank
- the index in the source array.- Returns:
- value of the found element (percentile).
- Throws:
NullPointerException
- if histogram argument is null.- See Also:
-
iValue
public static int iValue(int[] histogram, long rank) Precise equivalent ofiValue(long[], long)
for a case of int[] type of the histogram.- Parameters:
histogram
- histogram[k] is the number of elements in the source array that are equal to k. All histogram[k] must be non-negative; in other case, IllegalArgumentException can be thrown (but also can be not thrown).rank
- the index in the source array.- Returns:
- value of the found element (percentile).
- Throws:
NullPointerException
- if histogram argument is null.- See Also:
-
value
public static double value(long[] histogram, double rank) Floating-point version ofiValue(long[], long)
. AlikepreciseValue(long[], double)
, this function supposes that the histogram is built on an array of floating-point values after truncating them to an integer value (long)value, but it doesn't try to interpolate value between different bars of the histogram.More precisely, we suppose that if b[k]==b (here and below b[k]=histogram[k]), it means that the source floating-point array contains b values
k+j/b, j=0,1,...,b−1 . With this suggestion, this method finds the element of the source array v1 with the index #r1=⌊rank⌋=(long)rank. Obviously,v1 = v0+(r1-r0)/b[v0] , where v0 is the minimal integer value so thatb[0]+b[1]+...+b[v0]>r1 andr0=b[0]+b[1]+...+b[v0−1] . Then this method returnsv1+(rank−r1)/b[v0] (this value is equal tov0+(rank−r0)/b[v0] ). Please compare: unlikepreciseValue(long[], double)
, we do not find the next element v2 in the following bars of the histogram, but just interpolate between v1 and v2=v1+1/b[v0].As
iValue(long[], long)
, if rank<0, this method returnsmin (k∈Z: b[k]>0) (the minimal element in the source array), and ifrank≥(sum of all b[k]) , it returnsmax (k∈Z: b[k]>0)+1 (for floating-point array, it means the maximal element plus1/b[k] ). If all columnsb[k] are zero (no elements), this method returns 0.The result of this method is equal to the percentile v(r) for the passed r=rank in terms of the simple histogram model: see
comments to Histogram class
for more details.- Parameters:
histogram
- histogram[k] is the number of elements in the source floating-point array that are "almost equal" to k. All histogram[k] must be non-negative; in other case, IllegalArgumentException can be thrown (but also can be not thrown).rank
- the index in the source array (if non-integer, this method returns a real value, which is little greater than the element #r1=(long)rank, but is less than the next element #r1+1 and has the same integer part).- Returns:
- the found value (percentile).
- Throws:
NullPointerException
- if histogram argument is null.IllegalArgumentException
- if Double.isNaN(rank).- See Also:
-
value
public static double value(int[] histogram, double rank) Precise equivalent ofvalue(long[], double)
for a case of int[] type of the histogram.- Parameters:
histogram
- histogram[k] is the number of elements in the source floating-point array that are "almost equal" to k. All histogram[k] must be non-negative; in other case, IllegalArgumentException can be thrown (but also can be not thrown).rank
- the index in the source array (if non-integer, this method returns a real value, which is little greater than the element #r1=(long)rank, but is less than the next element #r1+1 and has the same integer part).- Returns:
- the found value (percentile).
- Throws:
NullPointerException
- if histogram argument is null.IllegalArgumentException
- if Double.isNaN(rank).- See Also:
-
iPreciseValue
public static int iPreciseValue(long[] histogram, double rank) "Interpolated" version ofiValue(long[], long)
, rounded to the "best" integer result. AlikepreciseValue(long[], double)
, this function supposes that the histogram is built on an array of floating-point values after truncating them to an integer value (long)value. In addition topreciseValue
, this function tries to approximate the real result by some nearest integer value.More precisely, we suppose that if b[k]==b (here and below b[k]=histogram[k]), it means that the source floating-point array contains b values
k+j/b, j=0,1,...,b−1 . With this suggestion, this method finds the element of the source array v1 with the index #r1=⌊rank⌋=(long)rank and the element of the source array v2 with the index#r2=r1+1 . Herev1 = v0+(r1-r0)/b[v0] , where v0 is the minimal integer value so thatb[0]+b[1]+...+b[v0]>r1 andr0=b[0]+b[1]+...+b[v0−1] , and there is the analogous formula for v2. If rank argument is integer (rank==(long)rank ), this method does not try to find v2 and just returns v1. Until this moment, this method works likepreciseValue
.After this, the behaviour of this method is more complicated. If rank is not integer, we calculate
v1'=v1+1/b[v1] andv2'=v2+1/b[v2] . Let's consider that the true real values in the source array#r1 (w1) and#r2=r1+1 (w2) are unknown, but lie in rangesv1≤w1<v1' andv2≤w2<v2' . (We really don't know the precise real values, we only know that someb[⌊v1⌋] values lie in⌊v1⌋..⌊v1⌋+1 range and someb[⌊v2⌋] values lie in⌊v2⌋..⌊v2⌋+1 range.) Then the value with real "index" rank, interpolated between w1 and w2, lies in rangea≤w<b , wherea=v1 + (rank−r1) * (v2−v1) (the result ofpreciseValue
call with the same arguments) andb=v1' + (rank−r1) * (v2'−v1') .This method finds the integer range
v..v+1 , which "covers" the range a..b in the best way. Namely, it calculatesv=⌊(a+b)/2⌋ and returns v as the result.The result of this method always lies between (int)p and Math.round(p), where p=
preciseValue
(histogram,rank).As
iValue(long[], long)
, if rank<0, this method returnsmin (k∈Z: b[k]>0) (the minimal element in the source array), and ifrank≥(sum of all b[k]) , it returnsmax (k∈Z: b[k]>0)+1 (for floating-point array, it means the maximal element plus1/b[k] ). Ifrank>(sum of all b[k])−1 , butrank<(sum of all b[k]) , then in formulas above there is no element v2 with the index#r2=r1+1 ; in this case, this method returnsmax (k∈Z: b[k]>0) . If all columnsb[k] are zero (no elements), this method returns 0.- Parameters:
histogram
- histogram[k] is the number of elements in the source floating-point array that are "almost equal" to k. All histogram[k] must be non-negative; in other case, IllegalArgumentException can be thrown (but also can be not thrown).rank
- the index in the source array (if non-integer, this method interpolates nearest elements).- Returns:
- interpolated value of the found element (percentile), rounded to the "best" integer value.
- Throws:
NullPointerException
- if histogram argument is null.IllegalArgumentException
- if Double.isNaN(rank).- See Also:
-
iPreciseValue
public static int iPreciseValue(int[] histogram, double rank) Precise equivalent ofiPreciseValue(long[], double)
for a case of int[] type of the histogram.- Parameters:
histogram
- histogram[k] is the number of elements in the source floating-point array that are "almost equal" to k. All histogram[k] must be non-negative; in other case, IllegalArgumentException can be thrown (but also can be not thrown).rank
- the index in the source array (if non-integer, this method interpolates nearest elements).- Returns:
- interpolated value of the found element (percentile), rounded to the "best" integer value.
- Throws:
NullPointerException
- if histogram argument is null.IllegalArgumentException
- if Double.isNaN(rank).- See Also:
-
preciseValue
public static double preciseValue(long[] histogram, double rank) "Interpolated" version ofiValue(long[], long)
. This function supposes that the histogram is built on an array of floating-point values after truncating them to an integer value (long)value.More precisely, we suppose that if b[k]==b (here and below b[k]=histogram[k]), it means that the source floating-point array contains b values
k+j/b, j=0,1,...,b−1 . With this suggestion, this method finds the element of the source array v1 with the index #r1=⌊rank⌋=(long)rank and the element of the source array v2 with the index#r2=r1+1 . Obviously,v1 = v0+(r1-r0)/b[v0] , where v0 is the minimal integer value so thatb[0]+b[1]+...+b[v0]>r1 andr0=b[0]+b[1]+...+b[v0−1] , and there is the analogous formula for v2. Note:v2=v1+1/b[v0] ifr2<r0+b[v0] or ifr2=r0+b[v0] and the next bar is non-zero:b[v0+1]>0 ; in other case, v2 is the minimal integer >v0 so thatb[v2]>0 .After finding v1 and v2, this method returns the value interpolated between them:
v1 + (rank−r1) * (v2−v1) . Note: if rank argument is integer (rank==(long)rank ), this method does not try to find v2 and just returns v1.As
iValue(long[], long)
, if rank<0, this method returnsmin (k∈Z: b[k]>0) (the minimal element in the source array), and ifrank≥(sum of all b[k]) , it returnsmax (k∈Z: b[k]>0)+1 (for floating-point array, it means the maximal element plus1/b[k] ). Ifrank>(sum of all b[k])−1 , butrank<(sum of all b[k]) , then in formulas above there is no element v2 with the index#r2=r1+1 ; in this case, it is supposedv2=v1+1 (the maximal element of the floating-point array plus1/b[k] ,k=v1 ). If all columnsb[k] are zero (no elements), this method returns 0.Please compare the described behaviour with little more simple behaviour of
value(long[], double)
method.The result of this method is equal to the percentile v(r) for the passed r=rank in terms of the precise histogram model: see
comments to Histogram class
for more details.- Parameters:
histogram
- histogram[k] is the number of elements in the source array that are "almost equal" to k. All histogram[k] must be non-negative; in other case, IllegalArgumentException can be thrown (but also can be not thrown).rank
- the index in the source array (if non-integer, this method interpolates nearest elements).- Returns:
- interpolated value of the found element (percentile).
- Throws:
NullPointerException
- if histogram argument is null.IllegalArgumentException
- if Double.isNaN(rank).- See Also:
-
preciseValue
public static double preciseValue(int[] histogram, double rank) Precise equivalent ofpreciseValue(long[], double)
for a case of int[] type of the histogram.- Parameters:
histogram
- histogram[k] is the number of elements in the source array that are "almost equal" to k. All histogram[k] must be non-negative; in other case, IllegalArgumentException can be thrown (but also can be not thrown).rank
- the index in the source array (if non-integer, this method interpolates nearest elements).- Returns:
- interpolated value of the found element (percentile).
- Throws:
NullPointerException
- if histogram argument is null.IllegalArgumentException
- if Double.isNaN(rank).- See Also:
-
percentile
public static double percentile(long[] histogram, long sumOfColumns, double percentileLevel) Equivalent topreciseValue
(histogram,percentileLevel*(sumOfColumns-1)), if sumOfColumns>0. If sumOfColumns==0, this method immediately returns 0.0; if sumOfColumns<0, this method throws an exception.If is supposed that sumOfColumns is the sum of all histogram elements. If it is true, the returned value is usually called the percentile of the source array with the level specified by percentileLevel argument. If percentileLevel==0.5, this value is also called the median of the source array. But, of course, you can pass any positive value as sumOfColumns: in any case, if sumOfColumns>0, this method returns the result of
preciseValue
(histogram,percentileLevel*(sumOfColumns-1)).- Parameters:
histogram
- histogram[k] is the number of elements in the source array that are "almost equal" to k. All histogram[k] must be non-negative; in other case, IllegalArgumentException can be thrown (but also can be not thrown).sumOfColumns
- should be equal to sum of all histogram elements (in other words, the length of the source array).percentileLevel
- the percentile level (usually from 0.0 to 1.0).- Returns:
- interpolated value of the found element (percentile).
- Throws:
NullPointerException
- if histogram argument is null.IllegalArgumentException
- if Double.isNaN(percentileLevel) or if sumOfColumns<0.- See Also:
-
percentile
public static double percentile(int[] histogram, long sumOfColumns, double percentileLevel) Equivalent topreciseValue
(histogram,percentileLevel*(sumOfColumns-1)), if sumOfColumns>0. If sumOfColumns==0, this method immediately returns 0.0; if sumOfColumns<0, this method throws an exception.If is supposed that sumOfColumns is the sum of all histogram elements. If it is true, the returned value is usually called the percentile of the source array with the level specified by percentileLevel argument. If percentileLevel==0.5, this value is also called the median of the source array. But, of course, you can pass any positive value as sumOfColumns: in any case, if sumOfColumns>0, this method returns the result of
preciseValue
(histogram,percentileLevel*(sumOfColumns-1)).- Parameters:
histogram
- histogram[k] is the number of elements in the source array that are "almost equal" to k. All histogram[k] must be non-negative; in other case, IllegalArgumentException can be thrown (but also can be not thrown).sumOfColumns
- should be equal to sum of all histogram elements (in other words, the length of the source array).percentileLevel
- the percentile level (usually from 0.0 to 1.0).- Returns:
- interpolated value of the found element (percentile).
- Throws:
NullPointerException
- if histogram argument is null.IllegalArgumentException
- if Double.isNaN(percentileLevel) or if sumOfColumns<0.- See Also:
-
sumOf
public static long sumOf(long[] histogram) Returns the sum of all elements of the passed histogram array.- Parameters:
histogram
- any array (for example, a histogram).- Returns:
- the sum of all elements of the passed array.
- Throws:
NullPointerException
- if histogram argument is null.
-
sumOf
public static int sumOf(int[] histogram) Returns the sum of all elements of the passed histogram array.- Parameters:
histogram
- any array (for example, a histogram).- Returns:
- the sum of all elements of the passed array.
- Throws:
NullPointerException
- if histogram argument is null.
-
clear
public static void clear(int[] histogram, int sumOfColumns) Fills all non-zero elements of histogram by 0.Works faster then trivial loop for all elements histogram[0..histogram.length-1], if most of last elements of histogram array already contain zero.
- Parameters:
histogram
- cleared histogram.sumOfColumns
- should be equal to sum of all histogram elements.
-