30,000 data points, find greatest change over 2 weeks

Question

Jun 15, 2012, 05:18 PM

I have:

- 30,000 data points
- each data point is a measurement of type float
- each measurement is associated with a date
- each date has only one measurement
- no dates are without measurements
- the data comes in the form of a text file: 30,000 lines in this form:
    - YYYY-MM-DD I.F (e.g. 1977-02-08 20.74)
- measurements appearing in the source file are already sorted by date
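
For reference, here is a minimal sketch of reading this format in Python. The function name `parse_measurements` is hypothetical, and the sketch assumes every non-blank line holds exactly one ISO date and one decimal number separated by whitespace, as in the example line above.

```python
from datetime import date


def parse_measurements(lines):
    """Parse 'YYYY-MM-DD value' lines into a list of (date, float) pairs.

    Assumption: one date and one number per line, separated by whitespace.
    The input order is preserved, so data sorted by date stays sorted.
    """
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines, e.g. at the end of the file
        day_text, value_text = line.split()
        records.append((date.fromisoformat(day_text), float(value_text)))
    return records
```

Any iterable of strings works as input, including an open text file.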

I need:

- a time-interval T with boundaries (s,e) /* start, end */
- (e - s = 14 days) the time-interval *must* be 2 weeks
- define min as the lowest value in the interval T
- define max as the greatest value in the interval T
- the chosen T needs to have the greatest distance between max and min of all possible Ts
- break ties among intervals T by choosing the most recent (with the greatest s value)
- the chosen T must consider all jumps in the 14 days, not just the values at s and e
- if the overall "variance" in the interval is great but the jump 
  |max-min| is not the greatest in absolute value, T is not the right choice,
  even if it's an "exciting" interval
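
The requirements above can be written down directly as a brute-force check. This is only a sketch under stated assumptions: `best_window_bruteforce` is a hypothetical name, `values` is the list of measurements in date order, and `window=14` assumes a 2-week interval covers 14 consecutive daily measurements (use 15 if both s and e are meant to be inclusive).

```python
def best_window_bruteforce(values, window=14):
    """Return (start_index, spread) for the window with the largest max - min.

    Every value inside the window is considered, not just the endpoints.
    Ties are broken in favour of the most recent window (largest start).
    Returns (None, -inf) if there are fewer values than `window`.
    """
    best_start, best_spread = None, float("-inf")
    for start in range(len(values) - window + 1):
        chunk = values[start:start + window]
        spread = max(chunk) - min(chunk)
        # ">=" rather than ">" lets a later window replace an earlier tie.
        if spread >= best_spread:
            best_start, best_spread = start, spread
    return best_start, best_spread
```

With 30,000 points and a window of 14 this is roughly 420,000 comparisons, so it can serve as a reference result for checking faster approaches.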

I'm asking:

- which algorithm to employ, considering algorithms are not my specialty
- which data structure to use to keep track of the subtotals

Notes:

- an answer in pseudo code would be preferred, "prose" is fine if pressured for time
- an answer in Python would be... splendid :)

If you like, you can generate dummy data and run your proposed algorithm on it as a test, or I could share the actual data.

I'm not very concerned about performance here, except that I'd like to know the fastest way to do this, so that I learn to apply the right solution and the correct algorithm.

I think I can "prove" correctness even with the simplest iterative algorithm, because the data set is small for today's computers.

So far, I'm "traversing and carrying along 14 vectors of 14 measurements"; if you could teach me how to do this incrementally with sub-sums, that would be really appreciated.
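
One possible incremental approach is sketched below. Running sub-sums do not carry over directly, because a max or min cannot be "subtracted" when a value leaves the window, so this sketch swaps in a different technique: two monotonic deques of indices, one tracking the window maximum and one the minimum. The name `best_window` is hypothetical, and `window=14` again assumes 14 consecutive daily measurements per interval.

```python
from collections import deque


def best_window(values, window=14):
    """Sliding-window max - min in a single pass using monotonic deques.

    Returns (start_index, spread); ties go to the most recent window.
    Returns (None, -inf) if there are fewer values than `window`.
    """
    max_q = deque()  # indices whose values are strictly decreasing
    min_q = deque()  # indices whose values are strictly increasing
    best_start, best_spread = None, float("-inf")
    for i, v in enumerate(values):
        # Discard older entries that can never be the max (or min) again.
        while max_q and values[max_q[-1]] <= v:
            max_q.pop()
        max_q.append(i)
        while min_q and values[min_q[-1]] >= v:
            min_q.pop()
        min_q.append(i)

        start = i - window + 1
        if start < 0:
            continue  # the first full window is not complete yet
        # Drop indices that have slid out of the left edge of the window.
        while max_q[0] < start:
            max_q.popleft()
        while min_q[0] < start:
            min_q.popleft()

        spread = values[max_q[0]] - values[min_q[0]]
        if spread >= best_spread:  # ">=" prefers the later window on ties
            best_start, best_spread = start, spread
    return best_start, best_spread
```

Each index is appended and removed at most once per deque, so the whole pass is linear in the number of measurements regardless of the window length.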