Efficient and smart storage of time series
Tue, 2012-01-10, 11:21
Dear all,
I have the following use case, and I would like to hear your suggestions.
I have to store data as pairs (t, y), where t is a time instant and y is the value y = f(t).
In the simple case, since my t were equidistant in time, I could store the values efficiently in an array.
class Data(values:Array[Double], pointsFrequency:Int) {
final def apply(month:Int) = values(month/pointsFrequency);
}
Now imagine the following case: for low t I want to store very frequent data, and for higher t I want to store less frequent data.
I end up with a ComplexData:
class ComplexData(subdata:IndexedSeq[Data]) {
final def apply(month:Int): Double = ??? // the implementation is the open question
}
What is the best implementation you can imagine ? :)
Best Regards
Tue, 2012-01-10, 11:51
#2
Re: Efficient and smart storage of time series
if you are willing to read through CS papers, maybe start from something like this
http://scholar.google.com/scholar?hl=en&q=Fast%2C+small-space+algorithms...
to gather ideas.
it all depends on your population size and the performance requirements.
obviously you need some sort of subsampling. if your distribution is approximately logarithmic that's probably the easiest to codify.
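to illustrate the logarithmic case, here is a minimal sketch, not a definitive implementation. it assumes months are non-negative integers and that one value is kept per power-of-two bucket (month 0, months 1-2, months 3-6, months 7-14, ...). the class name `LogSampledData` and the method `bucket` are made up for this example.

```scala
/** Hypothetical sketch: one stored value per power-of-two bucket of months.
  * Bucket k covers months 2^k - 1 up to 2^(k+1) - 2, so early months are
  * sampled densely and later months more and more sparsely.
  */
final class LogSampledData(values: Array[Double]) {

  /** Bucket index for a month >= 0, computed as floor(log2(month + 1)). */
  def bucket(month: Int): Int = {
    require(month >= 0, "month must be non-negative")
    31 - Integer.numberOfLeadingZeros(month + 1)
  }

  /** Value stored for the bucket that contains the given month. */
  def apply(month: Int): Double = values(bucket(month))
}
```

the lookup stays O(1) like the plain array version, and n stored values cover 2^n - 1 months. whether that resolution profile fits depends on the data.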
best, -sciss-
On 10 Jan 2012, at 10:21, Edmondo Porcu wrote:
> Dear all,
> I have the following use case, and I would like to hear your suggestions.
>
> I have to store data in t,y where t is a time instant and y is the value of y=f(t)
>
> In a simple case, since my t where equi-distant in time, I could store that efficiently in an array.
>
> class Data(values:Array[Double], pointsFrequency:Int) {
>
> final def apply(month:Int) = values(month/pointsFrequency);
>
> }
>
>
> Imagine now I have the following case: for low t I want to store very frequent data, for higher t I want to store less frequent data.
>
> I end up in having a complexData
>
> class ComplexData(subdata:IndexedSeq[Data]) {
>
> final def apply(month:Int)
>
> }
>
> What is the best implementation you can imagine ? :)
>
> Best Regards
>
>
Hi Edmondo,
Some important questions that would help us understand what you want:
a) How much data are we talking about?
b) How do you process it (sequentially, random search by time interval, ...)?
c) How space-efficient or fast does it really need to be?
d) Are you accessing all the values or just sampling?
e) What exactly do you mean by low t and high t in
> for low t I want to store very
> frequent data, for higher t I want to store less frequent data.
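Depending on the answers, one possible shape is sketched below. It assumes random lookup by month, a small number of blocks that are each sampled uniformly, and blocks that are sorted by start month and contiguous. The names `Segment`, `SegmentedData`, `startMonth` and `endMonth` are hypothetical and only serve the illustration; `pointsFrequency` keeps its meaning from the original `Data` class (months per stored point).

```scala
import java.util.Arrays

/** A uniformly sampled block, like the original Data class, that also
  * records the month at which it starts (an assumption of this sketch).
  */
final class Segment(val startMonth: Int, val pointsFrequency: Int, values: Array[Double]) {
  require(pointsFrequency > 0 && values.nonEmpty, "segment must hold at least one point")

  /** First month that is no longer covered by this segment. */
  val endMonth: Int = startMonth + values.length * pointsFrequency

  def apply(month: Int): Double = values((month - startMonth) / pointsFrequency)
}

/** Hypothetical container: segments must be sorted by startMonth and contiguous. */
final class SegmentedData(segments: IndexedSeq[Segment]) {
  private val starts: Array[Int] = segments.map(_.startMonth).toArray

  def apply(month: Int): Double = {
    // binarySearch returns -(insertionPoint) - 1 when the key is absent,
    // so the segment containing the month sits at insertionPoint - 1.
    val pos = Arrays.binarySearch(starts, month)
    val idx = if (pos >= 0) pos else -pos - 2
    require(idx >= 0 && month < segments(idx).endMonth, s"month $month is out of range")
    segments(idx)(month)
  }
}
```

For example, a monthly block for months 0 to 11 followed by a quarterly block for months 12 to 23 would be `new SegmentedData(Vector(new Segment(0, 1, monthly), new Segment(12, 3, quarterly)))`, where `monthly` and `quarterly` are arrays of 12 and 4 values. Lookup costs O(log n) in the number of segments. With only a handful of segments, a linear scan would likely do just as well.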