====== Qc2 Flag Analysis and Discussion ======

Note: a number of Qc2 useinfo changes are coded in lib/kvalobs/kvDataFlag.cc but are all commented out and inactive. For development, and as reflected in the examples provided below, the Qc2 items are all turned on. Please see [[#Turn on Qc2 in useinfo?|below]] for details.

===== Existing Specification =====

A general guide to setting values in the Qc2 control flags is given in the following set of slides from Lars. The current algorithms and the proposals for new algorithms are based on this:

  * [[https://kvalobs.wiki.met.no/lib/exe/fetch.php?id=kvoss%3Asystem%3Aqc2%3Arequirements&cache=cache&media=kvoss:system:qc2:flags_in_qc2.ppt|Flags in Qc2]]

==== Redistribution of Accumulated Values ====

^Controlflag Setting: |fd=7 |
^Condition: |The algorithm is successful. |
^Cfailed: |Updated to include algorithm name and type of interpolation used to generate the correction |
^ |(Also possible to use fd=A, fd=B, ... for other methods of redistribution, but such information can also always be kept in cfailed.) |

**Examples**

An example of the redistribution of accumulated precipitation. Where corrected values are estimated successfully, controlinfo(12), fd, is set to 7. The useinfo is then updated in accordance with the updated routine described [[#Turn on Qc2 in useinfo?|below]].
BEFORE Qc2

<code>
 stationid |       obstime       | original | paramid |       tbtime        | typeid | sensor | level | corrected |   controlinfo    |     useinfo      | cfailed
-----------+---------------------+----------+---------+---------------------+--------+--------+-------+-----------+------------------+------------------+-------------------------------------------
     53950 | 2007-12-07 06:00:00 |        9 |     110 | 2007-12-07 12:10:30 |    302 | 0      |     0 |         9 | 1140000000001000 | 7020400000000001 | QC1-2-72.b12:1,QC1-2-72.c12:1
     53950 | 2007-12-08 06:00:00 |   -32767 |     110 | 2007-12-09 00:36:39 |    302 | 0      |     0 |    -32767 | 1000003000002000 | 7899900000000000 | QC1-7-110:1
     53950 | 2007-12-09 06:00:00 |   -32767 |     110 | 2007-12-10 00:40:02 |    302 | 0      |     0 |    -32767 | 1000003000002000 | 7899900000000000 | QC1-7-110:1
     53950 | 2007-12-10 06:00:00 |      1.5 |     110 | 2007-12-10 11:25:20 |    302 | 0      |     0 |       1.5 | 1140000000002000 | 7330900000000001 | QC1-2-72.b12:1,QC1-2-72.c12:1,QC1-7-110:1
(4 rows)
</code>

AFTER Qc2

<code>
kvalobs=# select * from data where stationid=53950 and obstime>'2007-12-07' and obstime<'2007-12-11' and paramid=110;
 stationid |       obstime       | original | paramid |       tbtime        | typeid | sensor | level | corrected |   controlinfo    |     useinfo      | cfailed
-----------+---------------------+----------+---------+---------------------+--------+--------+-------+-----------+------------------+------------------+-----------------------------------------------------------------------
     53950 | 2007-12-07 06:00:00 |        9 |     110 | 2007-12-07 12:10:30 |    302 | 0      |     0 |         9 | 1140000000001000 | 7020400000000001 | QC1-2-72.b12:1,QC1-2-72.c12:1
     53950 | 2007-12-08 06:00:00 |   -32767 |     110 | 2007-12-09 00:36:39 |    302 | 0      |     0 |       0.1 | 1000003000007000 | 5899900000000000 | QC1-7-110:1 Qc2 Redis corrected was:-32767
     53950 | 2007-12-09 06:00:00 |   -32767 |     110 | 2007-12-10 00:40:02 |    302 | 0      |     0 |         0 | 1000003000007000 | 5899900000000000 | QC1-7-110:1 Qc2 Redis corrected was:-32767
     53950 | 2007-12-10 06:00:00 |      1.5 |     110 | 2007-12-10 11:25:20 |    302 | 0      |     0 |       1.4 | 1140000000007000 | 5336900000000001 | QC1-2-72.b12:1,QC1-2-72.c12:1,QC1-7-110:1 Qc2 Redis corrected was:1.5
(4 rows)
</code>

NOTES:

  * The operationally deployed version of Qc2 will not state "Redis corrected was XX" in cfailed. This is just a convenience for ongoing testing.
  * More examples of the controlinfo and useinfo behaviour are available in the [[https://kvalobs.wiki.met.no/lib/exe/fetch.php?id=kvoss%3Asystem%3Aqc2%3Aprototype&cache=cache&media=kvoss:system:qc2:qc2report20080623.doc|Prototype Report]]. The only difference is that at the time of the prototype report fd=8 was used as the controlinfo flag value.

==== TAN-TAX Interpolation ====

Interpolation of single missing temperature values using the average of TAN and TAX corresponding to the same time interval.

^Controlflag Setting: |ftime=1 |
^Condition: |The algorithm is successful |
^Cfailed: |Algorithm method recorded. |

**Examples**

Sample data before Qc2:

<code>
59680|2008-03-19 05:00:00|-1.7  |211|2008-03-19 04:54:05|311|0|0|-1.7|1111100000000010|7000000000000000|
59680|2008-03-19 06:00:00|-32767|211|2008-03-19 06:32:48|311|0|0|-1.7|1000601000000007|3891900000000021|QC1-4-211:1,hqc
59680|2008-03-19 07:00:00|-0.6  |211|2008-03-19 06:54:00|311|0|0|-0.6|1110100000000010|7000000000000000|
</code>

After the Qc2 algorithm, the row with the missing value is updated as follows:

<code>
59680|2020-03-19 06:00:00|-32767|211|2020-03-19 06:32:48|311|0|0|-0.7|1000601100000007|1891900000000021|QC1-4-211:1,hqc Qc2 UnitT corrected was:-1.7
</code>

**Discussion** (a summary of inputs from many different people, identities removed since this is the public wiki)

  * ftime should influence useinfo(3) and useinfo(4). It is proposed that the useinfo algorithms get a complete redraft once we have an overview of the QC2 features. ((This page and associated links try to answer this request.))
  * ftime is part of the useinfo(4) requirements but needs to be checked/tested.
  * ftime is not yet included in the requirements for useinfo(3). This needs to be done.
  * In this particular example fhqc=7, "Korrigert manuelt" (corrected manually), which must be a bug since the original value is missing (fhqc=5, "Interpolert manuelt" (interpolated manually), might have been an appropriate setting). The misleading value useinfo(3)=1, "Original verdi er manuelt korrigert" (original value is manually corrected), is a consequence of this mistake in fhqc.
  * Currently, for the operational configuration, the condition set is "DO NOT OVERWRITE A VALUE ALREADY CORRECTED BY HQC". For development/testing this condition is sometimes relaxed.
  * Since HQC has put a value in the corrected field (or possibly confirmed the value already inserted there by QC1-4), should it really be possible for QC2 to change that later?
    * Reasons for "no": setting up the criteria for useinfo(3) elsewhere becomes that much harder. It is no longer enough to check controlinfo; we also need to know which control was performed last (we would have useinfo(3) = 4 or 2 depending on whether HQC was performed before or after QC2).
    * Reasons for "yes": QC2 is objective and may be preferred to the subjective HQC, to get uniform results. At least some of the algorithms could be set up to overwrite the HQC results. Redistribution of accumulated precipitation is one of the candidates, at least for "old" data in Histkvalobs.

**Scheduling**

The default operation is that Qc2 shall not overwrite HQC results, and the need to even consider overwriting shall be minimised by always running Qc2 before HQC. Even so, there may be situations where new data becomes available and Qc2 is rerun, so that Qc2 may overwrite HQC. The rules to be followed in that case have to be specified. The same may happen when Qc2 is delayed:

  - The 06 observation is missing. fmis=3 changes to fmis=1 when QC1 puts in the model value, Tmod, and we get fnum=6 and fhqc=0. ftime=0 and useinfo=78947.
  - HQC changes the model value from Tmod to -1.7. fmis=1, fnum=6 (unchanged), fhqc=5 (manual interpolation). ftime=0 and useinfo=38929.
  - QC2 changes the "HQC value" to -0.8 (or -0.75?). **We must discuss if we will allow this.** But if we do, fmis and fnum are unchanged. fhqc is disregarded. Then fhqc=0. But fhqc should always be the last control. In this case it will stay as 0 (if no HQC) or be changed to 1 (probably, but 5 is also possible, if HQC). At this stage ftime=1 and useinfo=58946.

==== Space Check ====

In a **Space Check** an independent estimate of a parameter measured at a station is derived from a set of neighbouring values. Comparison with the estimate is used to assign a confidence level to the original parameter observation. The estimate can be considered a "model" value, so the higher settings of the fnum control flag (7-A) may be used to indicate the outcome of this quality control. If the Space Check finds agreement between the observed measurement and the model value, then fnum is set to 1 as usual.

^Controlflag Setting: |fnum=0-6 |fnum=7-A |
^Condition: |As already defined for numerical model runs. |Qc2 Values For Space Check |
^Cfailed: | Specific details of the SpaceCheck performed, e.g. type of check and interpolation method etc. ||
| --oo0oo-- |||
|fnum=7 | Controlled. Deviation between observed value and SpaceCheck estimate higher than high test value ||
|fnum=8 | Controlled. Deviation between observed value and SpaceCheck estimate lower than low test value ||
|fnum=9 | Controlled. Deviation between observed value and SpaceCheck estimate higher than highest test value ||
|fnum=A | Controlled. Deviation between observed value and SpaceCheck estimate lower than lowest test value ||

fnum [0-6] are set by [[https://kvalobs.wiki.met.no/doku.php?id=kvalobs:qc14|Qc1-4]] based on comparison with numerical model values. Qc1-4 will run before the Qc2-SpaceCheck.

**Open issues:**

  - Once the SpaceCheck algorithm is validated and demonstrated to be effective, shall the SpaceCheck result always take precedence over Qc1-4?
  - Is there a need to preserve the results of both Qc1-4 and Qc2-SpaceCheck?
  - Do we need to include logic to compare the results of Qc1-4 and Qc2-SpaceCheck before making the final flag settings?

==== Assessment of Variability ====

Another type of Space Check is to determine the variability in the nearest neighbour field. If the variability is high, then spatial algorithms are either not applied, or the confidence parameters are lowered to mark any given result or check (e.g. use ftime=2 rather than ftime=1).

==== Comparison with other fields ====

Future Qc2 checks will involve comparison with radar, satellite data etc. How will this eventuality be flagged? For weather analysis fw is available (although possibly now proposed for other uses as well), and for climatological controls (e.g. comparison with expected monthly statistics) fclim is available.

==== Correction generated by Time (and/or Space) Interpolation ====

^Controlflag Setting: |ftime=1 |
^Condition: |The algorithm is successful |
^cfailed: |Algorithm applied is recorded |
| --oo0oo-- ||

The general assumption is that a missing value (be it a single point or a set of points) is only actually replaced if there is good confidence in the corrected estimates, e.g. the time interpolation and spatial interpolation agree. ftime=2 and ftime=3 are also available for use as defined in the original kvalobs specification.

==== Outlier detection ====

The Space Check and Time Interpolation methods may also identify outliers, e.g. with a Dip test. The existing specification for controlinfo(3), fs, can capture such events from Qc2 checks.

==== Other Algorithms ====

TBD

==== General use of fw to log Qc2 corrections ====

This specification is taken from Slide 3 of [[https://kvalobs.wiki.met.no/lib/exe/fetch.php?id=kvoss%3Asystem%3Aqc2%3Arequirements&cache=cache&media=kvoss:system:qc2:flags_in_qc2.ppt| Flags in Qc2]].

^fw=0 |Not controlled |
^fw=1 |Controlled, found OK |
^fw=2 |Controlled. Slightly suspect value, not corrected (changed) |
^fw=3 |Controlled. Highly suspect value, not corrected (changed) |
^fw=4 |Controlled. Erroneous value, not corrected (new) |
^fw=5 |Controlled. Erroneous value, corrected automatically (new) |
^ - | - |
^fw=8 |Outlier, rejected (new) |

===== Turn on Qc2 in useinfo? =====

The kvalobs library has a built-in function which sets useinfo based on the controlinfo settings:

<code cpp>
bool setUseFlags(const kvControlInfo& cinfo);
</code>

It is located in ... src/lib/kvalobs/kvDataFlag.cc

setUseFlags already anticipates the setting of use flags in response to QC2 controls. However, most of the logic is commented out in the operational code, as indicated below:

<code cpp>
bool kvControlInfo::qc2dDone() const
{
  return false;//flag( f_fs ) or flag( f_ftime ) or flag( f_fw ) or flag( f_fstat );
}
...
bool kvControlInfo::qc2mDone() const
{
  return false;//flag( f_fclim ) or flag( f_fd );
}
...
ui[2]= 9;
//if ( cinfo.qc1Done() or cinfo.qc2Done() or cinfo.hqcDone() )
...
ui[ 3 ] = 9;
//if ( cinfo.qc1Done() or cinfo.qc2Done() or cinfo.hqcDone() )
...
ui[ 4 ] = 9; // NB: After useinfo[2]
//if ( cinfo.qc1Done() or cinfo.qc2Done() or cinfo.hqcDone() )
</code>

Implications:

  * Regression testing is needed when a change in the kvalobs base libraries is propagated to QC1.
  * The effect of a changed set of useinfo flags on the downstream data warehouses needs further investigation and testing.

In the ProcessUnitT example above the effect of turning on the Qc2 useinfo is visible, i.e. with the following changes applied:

<code cpp>
bool kvControlInfo::qc2dDone() const
{
  return flag( f_fs ) or flag( f_ftime ) or flag( f_fw ) or flag( f_fstat );
}
...
bool kvControlInfo::qc2mDone() const
{
  return flag( f_fclim ) or flag( f_fd );
}
...
ui[2]= 9;
if ( cinfo.qc1Done() or cinfo.qc2Done() or cinfo.hqcDone() )
...
ui[ 3 ] = 9;
if ( cinfo.qc1Done() or cinfo.qc2Done() or cinfo.hqcDone() )
...
ui[ 4 ] = 9; // NB: After useinfo[2]
if ( cinfo.qc1Done() or cinfo.qc2Done() or cinfo.hqcDone() )
</code>

===== Some empirical tests =====

Generate a sample Qc2 controlinfo value and observe which useinfo codes the logic above produces. (Please suggest additional test cases.)

^Qc2 Flag or Specific Algorithm ^Controlinfo ^Resulting useinfo ^
|RR24 Redistribution |[1000001000007000] |[5896900000000000]|
|RR24 Redistribution |[1140001000007000] |[5896900000000001]|
|ftime from TAN TAX interpolation |[1000600100000000] |[5033700000000001]|
|ftime |[1000600200000000] |[5033700000000001]|
|ftime |[1000600300000000] |[5033700000000001]|
|fnum |[1100700000000000] |[7000000000000001]|
|fnum |[1100800000000000] |[7000000000000001]|
|fnum |[1100900000000000] |[7000000000000001]|
|fnum |[1100A00000000000] |[7000000000000001]|
|fclim |[1111000000010000] |[5000000000000000]|
|fclim |[1111000000020000] |[5010500000000001]|
|fclim |[1111000000030000] |[5033500000000001]|
|fw |[1111000010000000] |[5000000000000000]|
|fw |[1111000020000000] |[5010500000000001]|
|fw |[1111000030000000] |[5033500000000001]|
|fw |[1111000040000000] |[5000000000000001]|
|fw |[1111000050000000] |[5000000000000001]|
|fw |[1111000080000000] |[5000000000000001]|
|control (no Qc2) |[1111000000000000] |[5000000000000000]|
|control (no Qc2) |[1111000000000000] |[5000000000000000]|
|control (no Qc2) |[1000001000000000] |[9899900000000000]|

Furthermore, here is an example of [[kvoss::system::qc2::flag::regression|Qc1 Regression tests]].

===== Open Issues =====

  - Shall src/lib/kvalobs/Qc2_kvDataFlag.cc be updated to include the Qc2 elements? This is part of the standard Qc1 software, so the change would require regression testing. Who is responsible for this decision and for carrying out the Qc1 regression testing? (//Maybe the results of some preliminary [[kvoss:system:qc2:flag:regression|regression tests]] are sufficient?//) :-) Please comment!
  - Use fw to record general information on Qc2 controls, or keep it reserved for weather analysis? See Slide 3 of [[https://kvalobs.wiki.met.no/lib/exe/fetch.php?id=kvoss%3Asystem%3Aqc2%3Arequirements&cache=cache&media=kvoss:system:qc2:flags_in_qc2.ppt| Flags in Qc2]].
  - Given this overview of Qc2 control flagging, is the specification for the corresponding useinfo in place? What still needs to be done?
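
To help with suggesting further test cases, the enabled qc2dDone() and qc2mDone() logic from the "Turn on Qc2 in useinfo?" section can be tried out on controlinfo strings with a small stand-alone sketch. This is not the kvalobs kvControlInfo class: the names ControlInfoSketch and FlagPos are hypothetical. The positions of fs (3) and fd (12) are stated on this page; those of fnum, fmis, ftime, fw, fclim and fhqc are read off the examples on this page; the position of fstat (9) and the composition of qc2Done() from the two other functions are assumptions.

```cpp
#include <array>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// 0-based positions of selected flags in the 16-character controlinfo string.
// fs and fd are stated on this page; fnum, fmis, ftime, fw, fclim and fhqc are
// inferred from the examples on this page; F_FSTAT = 9 is an ASSUMPTION.
enum FlagPos : std::size_t {
    F_FS = 3, F_FNUM = 4, F_FMIS = 6, F_FTIME = 7, F_FW = 8,
    F_FSTAT = 9, F_FCLIM = 11, F_FD = 12, F_FHQC = 15
};

// Hypothetical helper for experiments; not the kvalobs implementation.
struct ControlInfoSketch {
    std::array<int, 16> nibble;

    // Parses exactly 16 hexadecimal digits, e.g. "1000003000007000".
    explicit ControlInfoSketch(const std::string& text) {
        if (text.size() != 16)
            throw std::invalid_argument("controlinfo must have 16 characters");
        for (std::size_t i = 0; i < 16; ++i) {
            const unsigned char c = static_cast<unsigned char>(text[i]);
            if (!std::isxdigit(c))
                throw std::invalid_argument("controlinfo must be hexadecimal");
            nibble[i] = std::isdigit(c) ? c - '0' : std::toupper(c) - 'A' + 10;
        }
    }

    // Value of a single flag, 0..15.
    int flag(FlagPos pos) const { return nibble[pos]; }

    // Mirrors the enabled logic quoted on this page: any non-zero flag counts.
    bool qc2dDone() const {
        return flag(F_FS) || flag(F_FTIME) || flag(F_FW) || flag(F_FSTAT);
    }
    bool qc2mDone() const { return flag(F_FCLIM) || flag(F_FD); }

    // ASSUMPTION: qc2Done() is taken to be the union of the two checks above.
    bool qc2Done() const { return qc2dDone() || qc2mDone(); }
};
```

For example, ControlInfoSketch("1000003000007000") reports fd=7 and qc2mDone() true, while ControlInfoSketch("1100A00000000000") reports fnum=10 and qc2Done() false. Under this logic a non-zero fs alone is enough to make qc2dDone() true, as for 1111000000000000.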