class Logging::Stats::Sampler
A very simple class for fast, basic statistics sampling. You either feed it samples of the numeric data you want measured, or you call #tick to have it record the time elapsed since the last call. When you are done, call sum, sumsq, num, min, max, mean, or sd to get the information. Alternatively, call #to_s to see everything at once.
It does all of this very quickly and in constant memory: the samples are not stored; instead, all the values are calculated on the fly.
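To illustrate what "calculated on the fly" means, here is a minimal sketch. `RunningStats` and its `add` method are hypothetical stand-ins written for this example, not part of the library; they mirror the bookkeeping that the `sample` method documented further down performs.

```ruby
# Hypothetical stand-in, not the library class: it keeps four running
# values and never stores the samples themselves.
class RunningStats
  attr_reader :sum, :num, :min, :max

  def initialize
    @sum = 0.0
    @num = 0
    @min = 0.0
    @max = 0.0
  end

  # Fold one sample into the running values.
  def add(x)
    @sum += x
    if @num == 0
      @min = @max = x
    else
      @min = x if @min > x
      @max = x if @max < x
    end
    @num += 1
    self
  end

  def mean
    @num < 1 ? 0.0 : @sum / @num
  end
end

stats = RunningStats.new
[3.0, 1.0, 4.0, 1.0, 5.0].each { |x| stats.add(x) }
puts stats.mean
puts stats.min
puts stats.max
```

However many samples are added, the object holds the same four values, which is why memory use stays constant.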
Attributes
Public Class Methods
Class method that returns the headers that a CSV file would have for the values that this stats object is using.
# File lib/logging/stats.rb, line 88
def self.keys
  %w[name sum sumsq num mean sd min max]
end
Create a new sampler.
# File lib/logging/stats.rb, line 22
def initialize( name )
  @name = name
  reset
end
Public Instance Methods
Coalesce the statistics from the other sampler into this one. The other sampler is not modified by this method.
Coalescing the same two samplers multiple times should only be done if one of the samplers is reset between calls to this method. Otherwise statistics will be counted multiple times.
# File lib/logging/stats.rb, line 47
def coalesce( other )
  @sum += other.sum
  @sumsq += other.sumsq
  if other.num > 0
    @min = other.min if @min > other.min
    @max = other.max if @max < other.max
    @last = other.last
  end
  @num += other.num
end
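The double-counting warning can be seen with a small sketch. `Totals` and `merge!` are hypothetical names invented for this example; they copy only the additive part of the listing above (sum, sumsq, num) and are not the library's API.

```ruby
# Hypothetical stand-in holding only the totals that coalescing adds up.
Totals = Struct.new(:sum, :sumsq, :num)

# Add the totals of `other` into `into`; `other` is left unchanged.
def merge!(into, other)
  into.sum += other.sum
  into.sumsq += other.sumsq
  into.num += other.num
  into
end

a = Totals.new(6.0, 14.0, 3)   # totals for the samples 1, 2, 3
b = Totals.new(9.0, 41.0, 2)   # totals for the samples 4, 5

merge!(a, b)
count_after_first = a.num      # 5 samples, as expected

merge!(a, b)                   # b was not reset in between
count_after_second = a.num     # 7: b's two samples are counted again

puts count_after_first
puts count_after_second
```

Resetting the source between merges avoids the second, inflated count.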
You can call tick repeatedly if you need the delta times between a set of sample periods, but often you want to measure how long something takes between a start point and an end point. Call mark at the beginning and tick at the end to get this kind of measurement. Don't mix mark/tick sampling with tick-only sampling, or the measurements will be meaningless.
# File lib/logging/stats.rb, line 124
def mark
  @last_time = Time.now.to_f
end
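The following sketch shows the mark/tick pattern in isolation. `IntervalTimer` is a hypothetical stand-in written for this example: it reuses the `Time.now.to_f` logic from the listings on this page, but unlike the sampler it stores each delta in an array so the result is easy to inspect. The `sleep` calls are placeholders for real work.

```ruby
# Hypothetical stand-in mirroring the mark/tick timing logic.
class IntervalTimer
  attr_reader :deltas

  def initialize
    @deltas = []
    @last_time = Time.now.to_f
  end

  # Start of the region being measured.
  def mark
    @last_time = Time.now.to_f
  end

  # End of the region: record the time elapsed since mark (or the last tick).
  def tick
    now = Time.now.to_f
    @deltas << (now - @last_time)
    @last_time = now
  end
end

timer = IntervalTimer.new
3.times do
  sleep 0.01   # setup work that should not be measured
  timer.mark
  sleep 0.02   # the work being measured
  timer.tick
end

puts timer.deltas.size
```

Because mark resets the reference time just before the measured work, the setup time in each iteration is excluded from the recorded deltas.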
Calculates and returns the mean for the data passed so far.
# File lib/logging/stats.rb, line 99
def mean
  return 0.0 if num < 1
  sum / num
end
Resets the internal counters so you can start sampling again.
# File lib/logging/stats.rb, line 29
def reset
  @sum = 0.0
  @sumsq = 0.0
  @num = 0
  @min = 0.0
  @max = 0.0
  @last = nil
  @last_time = Time.now.to_f
  self
end
Adds a sample to the calculations.
# File lib/logging/stats.rb, line 60
def sample( s )
  @sum += s
  @sumsq += s * s
  if @num == 0
    @min = @max = s
  else
    @min = s if @min > s
    @max = s if @max < s
  end
  @num += 1
  @last = s
end
Calculates the sample standard deviation (with num - 1 in the denominator) of the data so far. Returns 0.0 when fewer than two samples have been recorded.
# File lib/logging/stats.rb, line 106
def sd
  return 0.0 if num < 2

  # (sqrt( ((s).sumsq - ( (s).sum * (s).sum / (s).num)) / ((s).num-1) ))
  begin
    return Math.sqrt( (sumsq - ( sum * sum / num)) / (num-1) )
  rescue Errno::EDOM
    return 0.0
  end
end
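As a sanity check on the formula in the listing above, this sketch compares the one-pass expression (which needs only sum, sumsq, and num) against the textbook two-pass sample standard deviation. The data set is made up for illustration.

```ruby
samples = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# One pass: only the three running totals are needed.
num   = samples.size
sum   = samples.sum
sumsq = samples.sum { |x| x * x }
one_pass = Math.sqrt((sumsq - (sum * sum / num)) / (num - 1))

# Two passes: compute the mean first, then the squared deviations from it.
mean = sum / num
two_pass = Math.sqrt(samples.sum { |x| (x - mean)**2 } / (num - 1))

puts one_pass
puts two_pass
```

Both expressions are algebraically equal. In floating point, the one-pass form can lose precision when the mean is large compared with the spread of the data, so small differences between the two results are possible on other inputs.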
Adds the time delta between now and the last time you called this method (or mark, or reset) as a sample. Called repeatedly, this gives you the average time between two activities.
An example is:
t = Logging::Stats::Sampler.new("do_stuff")
10000.times { do_stuff(); t.tick }
puts t.to_s
# File lib/logging/stats.rb, line 137
def tick
  now = Time.now.to_f
  sample(now - @last_time)
  @last_time = now
end
An array of the values: [name,sum,sumsq,num,mean,sd,min,max]
# File lib/logging/stats.rb, line 81
def to_a
  [name, sum, sumsq, num, mean, sd, min, max]
end
# File lib/logging/stats.rb, line 92
def to_hash
  {:name => name, :sum => sum, :sumsq => sumsq, :num => num,
   :mean => mean, :sd => sd, :min => min, :max => max}
end
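Since the headers returned by the keys class method are in the same order as the array returned by to_a, the two can be paired to build CSV lines. The sketch below uses literal arrays in place of real sampler output (the values correspond to the samples 1, 2, 3) and plain `join` instead of a CSV library; both choices are assumptions made to keep the example self-contained.

```ruby
# Stand-ins for Sampler.keys and a sampler's to_a result.
keys = %w[name sum sumsq num mean sd min max]
row  = ["do_stuff", 6.0, 14.0, 3, 2.0, 1.0, 1.0, 3.0]

header_line = keys.join(",")
data_line   = row.join(",")

# Pair each header with its value for lookup by name.
record = keys.zip(row).to_h

puts header_line
puts data_line
puts record["mean"]
```

A plain `join` is only safe while the name contains no commas or quotes; a real CSV writer would handle quoting.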
Returns statistics in a common format.
# File lib/logging/stats.rb, line 75
def to_s
  "[%s]: SUM=%0.6f, SUMSQ=%0.6f, NUM=%d, MEAN=%0.6f, SD=%0.6f, MIN=%0.6f, MAX=%0.6f" % to_a
end
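To show what the common format looks like, this sketch applies the same format string to a literal array standing in for to_a. The values are made up and correspond to the samples 1, 2, 3 recorded under the name "do_stuff".

```ruby
# Stand-in for a sampler's to_a result: name, sum, sumsq, num, mean, sd, min, max.
values = ["do_stuff", 6.0, 14.0, 3, 2.0, 1.0, 1.0, 3.0]

line = "[%s]: SUM=%0.6f, SUMSQ=%0.6f, NUM=%d, MEAN=%0.6f, SD=%0.6f, MIN=%0.6f, MAX=%0.6f" % values

puts line
# prints: [do_stuff]: SUM=6.000000, SUMSQ=14.000000, NUM=3, MEAN=2.000000, SD=1.000000, MIN=1.000000, MAX=3.000000
```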