
Efficient and smart storage of time series

edmondo1984
Dear all,
I have the following use case, and I would like to hear your suggestions.

I have to store data as pairs (t, y), where t is a time instant and y = f(t) is the value at that instant.

In the simple case, where my t values are equidistant in time, I can store the values efficiently in an array.

class Data(values: Array[Double], pointsFrequency: Int) {

  // one stored value per pointsFrequency months
  final def apply(month: Int): Double = values(month / pointsFrequency)

}


Now imagine the following case: for low t I want to store data at a high frequency, and for higher t I want to store data at a lower frequency.

I end up with a ComplexData:

class ComplexData(subdata: IndexedSeq[Data]) {

  // body deliberately left open: this is the question
  final def apply(month: Int): Double = ???

}

What is the best implementation you can imagine? :)

Best Regards
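
One possible shape for the ComplexData described above, as a minimal sketch only: it assumes the series is split into contiguous segments, each uniformly sampled at its own frequency, and that lookups are random access by month. The names Segment and PiecewiseSeries are hypothetical and do not come from this thread.

```scala
// Hypothetical names: Segment and PiecewiseSeries are illustrative only.
// A uniformly sampled stretch of the series, starting at startMonth,
// with one stored value per pointsFrequency months.
final case class Segment(startMonth: Int, pointsFrequency: Int, values: Array[Double]) {
  require(pointsFrequency > 0, "pointsFrequency must be positive")

  // First month that is NOT covered by this segment.
  def endMonth: Int = startMonth + values.length * pointsFrequency

  def apply(month: Int): Double = values((month - startMonth) / pointsFrequency)
}

final class PiecewiseSeries(segments: IndexedSeq[Segment]) {
  require(segments.nonEmpty, "at least one segment is required")
  // Assumption: segments are sorted and contiguous (no gaps, no overlaps).
  require(segments.sliding(2).forall {
    case Seq(a, b) => a.endMonth == b.startMonth
    case _         => true
  }, "segments must be contiguous")

  private val starts: Array[Int] = segments.map(_.startMonth).toArray

  def apply(month: Int): Double = {
    require(month >= starts.head && month < segments.last.endMonth,
      s"month $month is out of range")
    // Binary search for the last segment whose startMonth <= month.
    var lo = 0
    var hi = starts.length - 1
    while (lo < hi) {
      val mid = (lo + hi + 1) >>> 1
      if (starts(mid) <= month) lo = mid else hi = mid - 1
    }
    segments(lo)(month)
  }
}
```

Lookup costs O(log n) in the number of segments plus one array access, and storage stays as compact as the plain arrays. Whether this is a good fit depends on how many segments there are and on how the data is accessed.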


Tim P
Re: Efficient and smart storage of time series

Hi Edmondo,
A few questions whose answers would help us understand what you want:
a) How much data are we talking about?
b) How do you process it (sequentially, random search by time interval ...)?
c) How space-efficient or fast does it really need to be?
d) Are you accessing all the values or just sampling?
e) What exactly do you mean by low t and high t in
> for low t I want to store very
> frequent data, for higher t I want to store less frequent data.

On 10 January 2012 10:21, Edmondo Porcu wrote:
> Dear all,
> I have the following use case, and I would like to hear your suggestions.
>
> I have to store data in t,y where t is a time instant and y is the value of
> y=f(t)
>
> In a simple case, since my t where equi-distant in time, I could store that
> efficiently in an array.
>
> class Data(values:Array[Double], pointsFrequency:Int) {
>
> final def apply(month:Int) = values(month/pointsFrequency);
>
> }
>
>
> Imagine now I have the following case: for low t I want to store very
> frequent data, for higher t I want to store less frequent data.
>
> I end up in having a complexData
>
> class ComplexData(subdata:IndexedSeq[Data]) {
>
> final def apply(month:Int)
>
> }
>
> What is the best implementation you can imagine ? :)
>
> Best Regards
>
>

Sciss
Re: Efficient and smart storage of time series

if you are willing to read through CS papers, maybe start from something like this

http://scholar.google.com/scholar?hl=en&q=Fast%2C+small-space+algorithms...

to gather ideas.

it all depends on your population size and the performance requirements.

obviously you need some sort of subsampling. if your distribution is approximately logarithmic that's probably the easiest to codify.
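
To illustrate the logarithmic case, here is a minimal sketch that assumes one stored value per power-of-two bucket of months, so early months get their own slot and later months share one. The class name LogSeries and the choice of base 2 are assumptions made for illustration, not something proposed in this thread.

```scala
// Hypothetical name: LogSeries is illustrative only.
// Bucket k holds the value for months 2^k - 1 up to 2^(k+1) - 2,
// so resolution halves each time t roughly doubles.
final class LogSeries(values: Array[Double]) {

  // floor(log2(month + 1)), computed with integer bit operations.
  def bucket(month: Int): Int = {
    require(month >= 0 && month < Int.MaxValue, "month out of supported range")
    31 - Integer.numberOfLeadingZeros(month + 1)
  }

  // Throws ArrayIndexOutOfBoundsException if month lies beyond the stored buckets.
  def apply(month: Int): Double = values(bucket(month))
}
```

With four stored values this covers months 0 to 14: month 0 maps to bucket 0, months 1 and 2 to bucket 1, months 3 to 6 to bucket 2, and months 7 to 14 to bucket 3. A finer resolution would need several values per bucket.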

best, -sciss-


Copyright © 2012 École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland