-
Notifications
You must be signed in to change notification settings - Fork 3
Duplicate ping codes in sensor table causing duplicate detections #240
Comments
Not sure. There shouldn't be any sensors belonging to the same tag which have the same ping code (but ping codes for different tags can be duplicated). |
Okay so if I understand that correctly, the query below shouldn't return any result: SELECT transmitter_id, tag_id, COUNT(*) AS no_ids
from sensor
group by transmitter_id, tag_id
having COUNT(*) > 1 order by no_ids however it returns 261 duplicate sensors with the same tag_id. Have I got that right @jkburges? If so, should we do the same thing as for #241, i.e. csv export for @astecken to ask people to double check the info they've entered? |
Yes, I think the above query is correct. I had a look at a couple of tags having duplicate ping codes, e.g.: https://aatams.emii.org.au/aatams/tag/show/128160726 The interesting thing is that each sensor is of a different type - so that gives us a bit of a clue (perhaps) as to why this bug is happening. So, yes, we do need to find out from tag owners what the sensor types are actually on the tag, as well as actually fixing the bug to stop this from happening in the first place. |
Yes, pls, send csv file through and I will check with owners. Thanks! |
Here's the csv you were requesting @astecken: https://drive.google.com/file/d/0B0gOwvJSYXhsdngyMFhyMUNraTg/view?usp=sharing |
@astecken how are you going with this? |
Hi Xavier, I have contacted all collaborating researchers and am waiting for their responses. I'll let you know once I hear from them! |
It looks like people had problems in the past when entering their tags, i.e. they couldn't add a sensor to an existing tag ID. Hopefully, this will be resolved this afternoon after the 3.14.4 release; and then hopefully sorting out issue #253 at the same time :) |
@astecken any progress on that? |
Yes, Xavier, all sorted. The issue can be closed. Thanks! |
Hmm and yet when I run this query |
There's two parts to this: fixing the erroneous data (@astecken was contacting people re that) and adding validation so that it can't happen in the first place. Part B is definitely not done, so this can't be closed yet.
Yes. |
Thanks for clearing things up @jkburges. @astecken I just re-ran the query I used to extract https://drive.google.com/file/d/0B0gOwvJSYXhsdngyMFhyMUNraTg/view?usp=sharing and I still get about 270 rows, did you end up fixing the erroneous data in the web app? |
@astecken could you please respond to my previous message? I'm in the process of prioritising bug fixing for next iteration. |
Hi Xavier, I have cleaned up many of those duplicates already but there are some left still that I have to sort out. What I did is that I have added the pinger tags to the existing sensors. Thus, the pingers are still in the DB and need to be deleted once I have dealt with all the other ones. I will get to this after the workshop sometime and keep you posted. |
There's still a few left and I will deal with any outstanding issues in the new year. But yes, the delete function needs to be back before I can delete the double entries. Talk soon! |
@xhoenner In the excel spreadsheet that I just sent you is also one worksheet that has duplicate pinger entries. Could you delete all pingers so that only the sensor entries are in the DB, pls? Thanks! After that it might be useful to generate another list as above to see what's left :)
|
@astecken also asked to delete all transmitter IDs in the list below that are pingers.
|
@astecken it'd be useful if you could list in here the type of transmitter associated with each code space so that we may be able to implement some database constraints. That way we won't have to go through the process of fixing up wrongly assigned transmitter type on a regular basis. |
Queries to get rid of tag duplicates listed in #240
Code space Comments Does this help, @xhoenner ? |
In response to @astecken's code space vs. transmitter type constraints, here is below a breakdown of the current state of the db. I'll then produce a list of all the (A69) tags with the following code space 9002, 9004, 1601, and 9001 and wrongly assigned transmitter type based on your comment above so that you can advise what you want me to do. @astecken, not sure whether anything can be done for the other tag types, namely A180-1105, A180-9002, A69-1105, A69-1303, and A169-1206?
|
@astecken I can change the remaining 1601 and 9001 tags to be pingers, could you contact the owners of 9002 and 9004 tags to know what sensor these tag IDs had on board? |
@xhoenner Just send an email to Hugh to see if he can generate an extract from their DB to indicate which sensors we are dealing with. |
@xhoenner Apparently not straight forward for Hugh to extract this. I just started with the above list but it seems to be outdated. Could you generate an up-to-date one for me, pls? |
Updated list for you @fjaine, generated using the following SQL query. SELECT transmitter_id, tt.transmitter_type_name AS transmitter_type, p.name AS project_name
FROM sensor s
JOIN device d ON d.id = s.tag_id
JOIN project p ON p.id = d.project_id
JOIN code_map cm ON cm.id = d.code_map_id
JOIN transmitter_type tt ON tt.id = s.transmitter_type_id
WHERE (cm.code_map = 'A69-1601' AND transmitter_type_name != 'PINGER' AND transmitter_type_name != 'RANGE TEST') OR
(cm.code_map = 'A69-9001' AND transmitter_type_name != 'PINGER' AND transmitter_type_name != 'RANGE TEST') OR
(cm.code_map = 'A69-9002' AND transmitter_type_name = 'PINGER' AND transmitter_type_name != 'RANGE TEST') OR
(cm.code_map = 'A69-9004' AND transmitter_type_name = 'PINGER' AND transmitter_type_name != 'RANGE TEST')
ORDER BY code_map, transmitter_id, transmitter_type;
|
@julian1 the db constraints that need to be implemented to prevent this issue are as follows and refer to the code_map field in the A69-1601 pinger only @fjaine our developers will start working on fixing up this issue soon so could you confirm with Hugh urgently the above constraints and ask him the following questions:
Here's the revised breakdown of number of tag type for each code map.
|
@xhoenner I have contacted Hugh with the above questions - will let you know when he gets back to me |
@xhoenner here is more detail based on Hugh's reply: The below are correct: RANGE TEST TAGS: SENSOR TAGS:
Hugh also confirmed that for some tags sensors are easy to determine based on the name (e.g. V16T for temperature, V16PT for pressure& temp, etc.). But I don't think users have the option of specifying those names when registering tags (V16 or V16AP only choices in the dropdown menu)? Maybe the way to go would be to add a process that identified we are dealing with a sensor tag based on the code space entered, and then have mandatory checkboxes next to it for people to specify which sensors they are? |
Currently we can't assign two transmitters having different code spaces to a same tag @fjaine: when clicking on 'Add sensor' in the
|
This will be fixed with #347 and #373. I'm migrating the list in #240 (comment) to https://github.com/aodn/aatams-content/issues/57. |
Out of interest, why is https://github.com/aodn/aatams-content private? |
Originally it was created to avoid exposing inadvertently protected or embargoed data. We're mostly dealing with metadata issues in there and have left all issues related to the web app and design of the database in the aatams repo. |
Steps to reproduce
This bug was found when trying, with Talend, to copy the content of the valid_detection view and import it into a table with a primary key on the following fields (
receiver_deployment_id
,timestamp
,transmitter_id
, andscientific_name
). The harvester failed because it had found duplicate detections based on this primary key. This bug might be related to #119.What happens?
Running the following query returns two rows:
SELECT * FROM valid_detection WHERE transmitter_id = 'A69-9002-14774' AND timestamp = '2013-09-13 19:17:45+10';
What should happen?
Instead it should return a single detection. @jkburges thinks it is due to tag_ids in the sensor table having the same
ping_code
. Running the following query should return only a single row, however, two rows are returned.SELECT * FROM sensor WHERE transmitter_id = 'A69-9002-14774';
The text was updated successfully, but these errors were encountered: