Useinfo Controls

Specification

Proposal:

  • 8 > useinfo(0) > 0 (some qc has been done)
  • useinfo(1) = 0 (ok obs time)
  • useinfo(2) < 2 (original is ok or probably correct)
  • useinfo(3) < 3 (original unchanged and manually corrected or interpolated)
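
The four conditions can be read as a single predicate on the useinfo string. The following is a minimal sketch, assuming useinfo is stored as a 16-character string with one flag digit per position (as in the data rows quoted further down this page). The function name proposal_accepts is hypothetical and not part of kvQc2.

```python
def proposal_accepts(useinfo: str) -> bool:
    """Return True if an observation passes the four proposed conditions."""
    # One character per flag; int(c, 16) is used in case a position
    # holds a hex digit (assumption).
    u = [int(c, 16) for c in useinfo]
    return (
        0 < u[0] < 8      # some qc has been done
        and u[1] == 0     # ok obs time
        and u[2] < 2      # original is ok or probably correct
        and u[3] < 3      # condition on useinfo(3) as stated in the proposal
    )


print(proposal_accepts("7000000000000000"))  # True
print(proposal_accepts("7033700000000002"))  # False, because useinfo(2)=3
```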

Discussion:

The conditions on useinfo(0) and useinfo(2) look ok. Disregarding observations with useinfo(2)=2 (“Very suspicious”) sounds sensible (although not self-evident). If useinfo(2) < 2, then useinfo(3) will necessarily be 0, so there is no need to look at useinfo(3).

I suggest not taking useinfo(1) into account either. For some observation types, arriving 11 minutes after termin time is enough to get useinfo(1)=1, which is therefore a very common value. We could, however, consider looking at useinfo(7) instead, and reject the observation if it was “performed too early or too late” (“utført for tidlig eller sent”): useinfo(7)=1,2,5,6. Someone other than me has to decide whether these flag values should disqualify the observation from being used in the QC2 interpolation. This can wait, since these flag values are very rare: there were no examples during the last 2 days.
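
The reduced rule from this discussion can be sketched in the same way. This again assumes a 16-character useinfo string with one flag per position. The function name and the reject_timing switch are hypothetical. Whether the useinfo(7) check should apply at all is still undecided, so it is off by default here.

```python
# useinfo(7) values that mean the observation was performed too early or too late
REJECTED_U7 = {1, 2, 5, 6}


def discussion_accepts(useinfo: str, reject_timing: bool = False) -> bool:
    """Check only useinfo(0) and useinfo(2), optionally also useinfo(7)."""
    u = [int(c, 16) for c in useinfo]
    if not 0 < u[0] < 8:       # some qc has been done
        return False
    if u[2] >= 2:              # very suspicious or worse
        return False
    if reject_timing and u[7] in REJECTED_U7:
        return False
    return True                # useinfo(1) and useinfo(3) are ignored
```

With this rule an observation with useinfo(1)=1 (for example one that arrived 11 minutes late) is still accepted.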

Sample cfg file constructed

kvalobs@pak:~/etc/kvalobs/Qc2Config$ cat 211-SingleLinear.cfg 
#[Specific Runs]
#Last 24hours .....
#Daily Run ....
# /disk1/KVALOBS/Qc2Config .... $KVALOBS

# Performs simple linear interpolation for a single point replacement.
AlgoCode=10
#InterpCode=9   # 9
RunAtHour=23
RunAtMinute=18

#[Time Range]
Start_YYYY=2036
Start_MM=2
Start_DD=28
Start_hh=0
Start_mm=0
Start_ss=0

End_YYYY=2036
End_MM=4
End_DD=1
End_hh=0
End_mm=0
End_ss=0

#[Time Step]
#Step_hh=12 

#[Specific Data Type and Parameter ids etc.]
ParamId=211
MaxParamId=215
MinParamId=213
#InterpolationDistance=50.0
# Only write back the result if not previously controlled
W_fhqc=0

#Flag to set if value is corrected
S_ftime=1
change_fmis=3->1
change_fmis=0->4
#Flag to check if the algorithm shall be applied ...
Not_ftime=1   # ftime=1 is set if the algorithm is run. Do not use an interpolated value for another interpolation.
Not_fnum=6    # This is to cut out the data with large values
#NotU_0=7
U_0=1
U_0=2
U_0=3
U_0=4
U_0=5
U_0=6
U_0=7
NotU_2=2
#NotU_7=1
#NotU_7=2
#NotU_7=5
#NotU_7=6
kvalobs@pak:~/etc/kvalobs/Qc2Config$ 
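
Several keys in the file above (U_0, change_fmis) appear more than once, so a reader of this format has to keep every value rather than only the last one. Below is a minimal sketch of such a reader. It assumes only what the sample shows (key=value lines, with # starting a comment). It is not the kvQc2 parser, and parse_cfg is a hypothetical name.

```python
from collections import defaultdict


def parse_cfg(text: str) -> dict:
    """Collect key=value pairs, keeping all values of repeated keys."""
    values = defaultdict(list)
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line or "=" not in line:
            continue                            # blank, comment-only or prompt line
        key, value = line.split("=", 1)         # split on the first '=' only
        values[key.strip()].append(value.strip())
    return dict(values)


sample = """
AlgoCode=10
#InterpCode=9   # 9
Not_fnum=6    # This is to cut out the data with large values
U_0=1
U_0=2
NotU_2=2
change_fmis=3->1
change_fmis=0->4
"""
cfg = parse_cfg(sample)
print(cfg["U_0"])          # ['1', '2']
print(cfg["change_fmis"])  # ['3->1', '0->4']
```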

Results

NB: The above is on canned test data (same as below); it now needs to be run on the real setup.

Tests On Other Parameters

All the following used “AlgoCode=10” except for paramid=212 that used “AlgoCode=9”. Ref: Algorithm summary: https://kvalobs.wiki.met.no/doku.php?id=kvoss:system:qc2:user:algorithms

Paramids tested: 112, 173, 211, 212, 213, 215, 262

The above results were obtained using the following test data

Tests may be conducted thus:

1. Clean old data from the database:
     DELETE FROM data WHERE obstime>'2036-02-28' and obstime<'2036-04-01';

2. Load data …
     \copy data FROM 'multiparamtersinglemissingpointtest.dat' WITH DELIMITER AS '|'

3. Run the algorithms, e.g. with the example configuration files:
     https://svn.met.no/kvoss/kvQc2/branches/kvqc2-1.0.1/src/Reference/211.cfg
     https://svn.met.no/kvoss/kvQc2/branches/kvqc2-1.0.1/src/Reference/212.cfg

4. Extract results:
     kvalobs=# select * from data where paramid=211 and obstime between '2036-02-28' and '2036-03-05' and cfailed like '%QC2d-2%'  \g | cat >> ./211.dat;
     kvalobs=# select * from data where paramid=212 and obstime between '2036-02-28' and '2036-03-05' and cfailed like '%QC2d-2%'  \g | cat >> ./212.dat;

5. Plot, analyse results …

Single Missing Point

First Algorithm

The first single missing point algorithm addressed hourly temperature and provided a corrected value by taking the average of the maximum and minimum observed temperature for that hour recorded at the following time stamp (as described here).

Test data has been taken from March 2008. On each day for all stations the “original” hourly temperature has been set to a missing value -32767 for the times 05:00, 11:00, 17:00 and 23:00 UT and the original value is kept for reference. The results of the algorithm are then compared to the original data:
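
The preparation of such test data could be sketched as follows. The actual procedure used to build the test file is not shown on this page, so this is only an illustration: each observation is assumed to be a dict with the hypothetical keys obstime and original, and the untouched value is kept under a hypothetical key reference.

```python
from datetime import datetime

MISSING = -32767                 # missing-value marker used in the test data
TEST_HOURS = {5, 11, 17, 23}     # hours (UT) at which the value is blanked out


def mask_test_hours(rows):
    """Return copies of rows with 'original' set to MISSING at the test hours."""
    masked = []
    for row in rows:
        new = dict(row)
        t = row["obstime"]
        if t.hour in TEST_HOURS and t.minute == 0:
            new["reference"] = row["original"]   # keep the true value for comparison
            new["original"] = MISSING
        masked.append(new)
    return masked


rows = [
    {"obstime": datetime(2008, 3, 10, 4, 0), "original": 2.9},
    {"obstime": datetime(2008, 3, 10, 5, 0), "original": 3.1},
]
for r in mask_test_hours(rows):
    print(r["obstime"], r["original"], r.get("reference"))
```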

Good results are obtained. Note the points at -99.9 arise due to this value being present in the original data record (and they are not a result of the algorithm).

Generalisation

The algorithm has now been generalised (so-called SingleMissingPoint):

  • The paramids for the value to be corrected and the minimum and maximum values are now specified in the configuration file. This means the same algorithm may be applied to different parameters as appropriate (each case still to be tested).
  • If no Max and Min values are available, or if they are simply not declared in the config file, the algorithm performs a linear interpolation instead. (In future, the interpolation method may also be selectable, e.g. once an Akima routine has been finalised.)

Running the new algorithm on the same data as above, the algorithm first tries to do a Max and Min average. If this is not possible a linear interpolation is performed across the gap. First results:
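
The fallback order described here can be sketched as below. This is an illustration only, not the kvQc2 code. The function and argument names are hypothetical, and it is assumed that missing minimum and maximum values are marked with the same -32767 value as the main parameter.

```python
from typing import Optional

MISSING = -32767.0


def _valid(x) -> bool:
    return x is not None and x != MISSING


def fill_single_missing(prev, nxt, tan_next=None, tax_next=None) -> Optional[float]:
    """Fill one missing value at time t.

    prev, nxt          : values of the parameter at t-1 and t+1
    tan_next, tax_next : minimum and maximum recorded at the following time stamp
    """
    # First choice: average of the recorded minimum and maximum.
    if _valid(tan_next) and _valid(tax_next):
        return 0.5 * (tan_next + tax_next)
    # Fallback: linear interpolation across the single-point gap.
    if _valid(prev) and _valid(nxt):
        return 0.5 * (prev + nxt)
    return None  # nothing usable, leave the value missing


print(fill_single_missing(2.0, 4.0, tan_next=1.0, tax_next=6.0))  # 3.5
print(fill_single_missing(2.0, 4.0))                              # 3.0
```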

It is clear that in the above graph there are additional points that also show good agreement with the original values. There are also many original values that are unphysical (e.g. high temperatures in March), so filtering on flags also has to be applied so that these cases are not included. Examples of high temperatures in the database:

     18020 | 2008-03-10 03:00:00 |    58.78 |     211 | 2008-03-10 06:07:09 |      4 | 0      |     0 |       2.8 | 1000600000000610 | 7033700000000002 | QC1-0-211:1,QC1-4-211:1
     18020 | 2008-03-10 04:00:00 |     57.4 |     211 | 2008-03-10 06:07:09 |      4 | 0      |     0 |       2.9 | 1000600000000610 | 7033700000000002 | QC1-0-211:1,QC1-4-211:1
     18020 | 2008-03-10 05:00:00 |    56.02 |     211 | 2008-03-10 06:07:09 |      4 | 0      |     0 |       3.1 | 1000600000000610 | 7033700000000002 | QC1-0-211:1,QC1-4-211:1

Ok, let's filter out results where controlinfo(4)=6, i.e. fnum=6. This gives the following improved result, but the algorithm still appears to act on some temperatures that are too high. These should not be corrected by interpolating between the values before and after, which are also too high!?
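
To make the flag position concrete, the three rows quoted above can be parsed and their fnum digit read from the controlinfo string. This is a sketch only: the column names are an assumption about the layout of the data table (the rows above carry no header), and parse_row and fnum are hypothetical helpers.

```python
ROWS = """\
18020 | 2008-03-10 03:00:00 |    58.78 |     211 | 2008-03-10 06:07:09 |      4 | 0      |     0 |       2.8 | 1000600000000610 | 7033700000000002 | QC1-0-211:1,QC1-4-211:1
18020 | 2008-03-10 04:00:00 |     57.4 |     211 | 2008-03-10 06:07:09 |      4 | 0      |     0 |       2.9 | 1000600000000610 | 7033700000000002 | QC1-0-211:1,QC1-4-211:1
18020 | 2008-03-10 05:00:00 |    56.02 |     211 | 2008-03-10 06:07:09 |      4 | 0      |     0 |       3.1 | 1000600000000610 | 7033700000000002 | QC1-0-211:1,QC1-4-211:1
"""

# Assumed column order of the data table.
COLUMNS = ["stationid", "obstime", "original", "paramid", "tbtime", "typeid",
           "sensor", "level", "corrected", "controlinfo", "useinfo", "cfailed"]


def parse_row(line: str) -> dict:
    return dict(zip(COLUMNS, (field.strip() for field in line.split("|"))))


def fnum(row: dict) -> int:
    # fnum is the flag at position 4 of controlinfo (counting from 0).
    return int(row["controlinfo"][4], 16)


rows = [parse_row(line) for line in ROWS.splitlines()]
kept = [r for r in rows if fnum(r) != 6]
print([fnum(r) for r in rows])  # [6, 6, 6]
print(len(kept))                # 0, all three rows are filtered out
```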

Configuration File Used:

#[Specific Runs]
#Last 24hours .....
#Daily Run ....
# /disk1/KVALOBS/Qc2Config .... $KVALOBS

# Performs simple linear interpolation for a single point replacement.
AlgoCode=9
#InterpCode=9   # 9
RunAtHour=4 
RunAtMinute=58   

#[Time Range]
Start_YYYY=2036  #NB in the test data I set all 2008->2036 for convenience ...
Start_MM=2
Start_DD=28
Start_hh=0
Start_mm=0
Start_ss=0

End_YYYY=2036
End_MM=4
End_DD=1
End_hh=0
End_mm=0
End_ss=0

#[Time Step]
#Step_hh=12 

#[Specific Data Type and Parameter ids etc.]
ParamId=211
MaxParamId=215
MinParamId=213
#InterpolationDistance=50.0
# Only write back the result if not previously controlled
W_fhqc=0

#Flag to set if value is corrected
S_ftime=1
change_fmis=3->1
change_fmis=0->4
#Flag to check if the algorithm shall be applied ...
A_ftime=1   #
A_fnum=6    # This is to cut out the data with large values

New Specification

A. TA(t), TAN(t) and TAX(t) are missing

TA(t)= 0.5 * [TA(t-1) + TA(t+1)]     Linear interpolation

B. TA(t) is missing, TAN(t) and TAX(t) exist

Still we use TA(t)= 0.5 * [TA(t-1) + TA(t+1)], but
if TA(t) > TAX(t) --> TA(t) = TAX(t)
if TA(t) < TAN(t) --> TA(t) = TAN(t)
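
Cases A and B can be written as one small function. This is a minimal sketch of the rule as stated, not the kvQc2 implementation. The function name is hypothetical. The specification does not say what to do when only one of TAN(t) and TAX(t) exists, so that case is treated here like case A (an assumption).

```python
def interpolate_ta(ta_prev, ta_next, tan=None, tax=None):
    """Estimate a missing TA(t) from TA(t-1) and TA(t+1).

    tan, tax: TAN(t) and TAX(t), or None when they are missing.
    """
    ta = 0.5 * (ta_prev + ta_next)          # linear interpolation (cases A and B)
    if tan is not None and tax is not None:  # case B: limit to the observed range
        if ta > tax:
            ta = tax
        if ta < tan:
            ta = tan
    return ta


print(interpolate_ta(2.0, 6.0))                    # 4.0 (case A)
print(interpolate_ta(2.0, 6.0, tan=1.0, tax=3.5))  # 3.5 (capped at TAX)
print(interpolate_ta(2.0, 6.0, tan=4.5, tax=7.0))  # 4.5 (raised to TAN)
```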

Controlinfo to check …

1. Same typeid (in 2008 one station (59110) had both typeid=3 and typeid=330).
Perhaps this is not a problem today (cleaning obs_pgm?)

2. stationid < 100000 (we have no control over foreign stations)

3. fnum=6 (as you have already done). But if fnum has not been run,
I propose fpre=6-7 and fhqc=A as well

Applying this algorithm and controls:

4. Controlinfo(0-4)=00000 or useinfo(0-4)=99999 (No control at all)

Applying the filter "do not include fqclevel=0" gives:

The corresponding configuration file to produce the above result:

# Performs simple linear interpolation for a single point replacement.
AlgoCode=10
#InterpCode=9   # 9
RunAtHour=0 
RunAtMinute=42   

#[Time Range]
Start_YYYY=2036
Start_MM=2
Start_DD=28
Start_hh=0
Start_mm=0
Start_ss=0

End_YYYY=2036
End_MM=4
End_DD=1
End_hh=0
End_mm=0
End_ss=0

#[Time Step]
#Step_hh=12 

#[Specific Data Type and Parameter ids etc.]
ParamId=211
MaxParamId=215
MinParamId=213
#InterpolationDistance=50.0
# Only write back the result if not previously controlled
W_fhqc=0

CfailedString="MIST"

#Flag to set if value is corrected
S_ftime=1
change_fmis=3->1
change_fmis=0->4
#Flag to check if the algorithm shall be applied ... i.e. the following are restrictions:
A_ftime=1   #
A_fnum=6    # This is to cut out the data with large values
A_fpre=6
A_fpre=7
A_fhqc=A
#V_fpre=1
#V_fpre=2
A_fqclevel=0

TODO: Use obs_pgm to generate better station list … (currently all stations used)

CFAILED settings

Proposed content for cfailed:

At a minimum, “QC2d-2” should be written to cfailed. More algorithms for time-series fitting will probably follow, so “QC2d-2 MIST” is better. Is it correct to assume that the algorithm only sets ftime=1, and not ftime=2 or ftime=3? If not, there may be grounds for writing more to cfailed in those cases.

2010/06/20 09:25

  • kvalobs/kvoss/system/qc2/test/algorithms/singlemissingpoint.txt
  • Last modified: 2022-05-31 09:29:32
  • (external edit)