Address FIAS in the PostgreSQL environment. Part 1

As if we did not treat the address as FIAS they need to work, because it is the only nationwide directory of addresses. So sooner or later you have to solve the problem of when the location of the real estate, legal and physical address with the address of the system.

This article describes the experience of working with a list adresovana elements of the FIAS that are loaded into a database under PostgreSQL. To work with adresource elements FIAS created several functions in the language PL/pgSQL.

Full text of the article consists of 4 parts. In the first half of this part of the article provides comments on the implementation of the function. The second source text of the function. Those readers who are only interested in the source code, we offer go directly to the App.

the

Pedigree adresource


Let's start with an example. Call function fstf_AddressObjects_AddressObjecttree('bfc1236d-b5d2-4734-a238-3b1e4830e963') will generate the following list of records.

the table 1. The function execution result fstf_AddressObjects_AddressObjecttree
the the the the the
AOGUID CurrStatus ActStatus AOLevel ShortName FormalName
db9c4f8b-b706-40e2-b2b4-d31b98dcd3d1 0 1 1 edge Krasnoyarsk
625497d3-22de-4390-b4b4-2febfbfc15ce 0 1 3 rn Balahtinskoj
39da6405-b3e6-4baf-b332-d47b73b4d5fb 0 1 6 p Mighty
bfc1236d-b5d2-4734-a238-3b1e4830e963 0 1 7 ul

Upon closer examination, you notice that the argument passed to the function the ID of the item (AOGUID) "New street", received four entries:

the
    the
  • entry with the characteristics of the New street;
  • the
  • the parent record on the village, district and province, which belongs to the street.

The function there is another optional parameter is the actuality (CurrStatus), which you can view the pedigree not only relevant adresource element, but is already outdated.

Full text of the function is given in the Appendix in subsection Create function fstf_AddressObjects_AddressObjecttree

the

From the beginning


If you know how to construct tables FIAS, you can skip this section.

The need for such a function is dictated by the fact that the list adresovany elements of FIAS (ADDROBJ) is a tree structure, where each element refers to the ID (AOGUID) parent record field value ParentGUID. Ie browsing source list, usually see a long list of streets. To determine in which locality is it is necessary for the ParentGUID value to find the entry with the specified element ID.



the Fig. 1. Hierarchy adresovana elements of FIAS.

This is not a misprint and not a reservation. The ParentGUID value can be found not just one but several entries. From which it follows that ID adresource element is the primary key for the table containing the list ADDROBJ.

The fact that the list adresovana elements, with each element stores the history of its "rename". I.e. under one item ID is stored not only the current name of the element, but its previous names. Ie, a separate list entry ADDROBJ stores data about adresource element, and characteristics of a period during which it was the actual name of the element.

the table 2. The history of the street "Krasnoyarsk Krai, R-n Taimyr Dolgan-Nenets, g Dudinka, p Lewinsky Sands, ul Beregovaya"
the the the the the the the the the
record ID, AOID ID the previous record PrevID ID of the next record NextID Tag for KLADR CurrStatus the actuality ActStatus start date of the period StartDate period end date EndDate
fcf51361-5494-4edc-a6bc-d5c0d471c729 2a993f3b-5743-426c-8b7d-b5c7affe49cd 0 1 25.11.2015 0:00 06.06.2079 0:00
2a993f3b-5743-426c-8b7d-b5c7affe49cd 9199c92b-18a5-431a-8b13-f54abe36e84f fcf51361-5494-4edc-a6bc-d5c0d471c729 7 0 30.09.2015 0:00 25.11.2015 0:00
9199c92b-18a5-431a-8b13-f54abe36e84f b06ff65e-aadb-42eb-9c70-a8548a40645c 2a993f3b-5743-426c-8b7d-b5c7affe49cd 6 0 28.09.2015 0:00 30.09.2015 0:00
b06ff65e-aadb-42eb-9c70-a8548a40645c a997aa7c-f2b1-4b7e-9471-5fc1f51a6752 9199c92b-18a5-431a-8b13-f54abe36e84f 5 0 08.04.2013 0:00 28.09.2015 0:00
a997aa7c-f2b1-4b7e-9471-5fc1f51a6752 b27d8e44-ade1-4dd0-b4b5-5eaa7205ff0b b06ff65e-aadb-42eb-9c70-a8548a40645c 4 0 11.01.2013 0:00 08.04.2013 0:00
b27d8e44-ade1-4dd0-b4b5-5eaa7205ff0b 327b28cc-5171-43c6-bd88-a0a2172bbf71 a997aa7c-f2b1-4b7e-9471-5fc1f51a6752 3 0 07.12.2012 0:00 11.01.2013 0:00
327b28cc-5171-43c6-bd88-a0a2172bbf71 fb7b54db-7efb-4aaf-b4cb-2364b35b80b3 b27d8e44-ade1-4dd0-b4b5-5eaa7205ff0b 2 0 01.02.2012 0:00 07.12.2012 0:00
327b28cc-5171-43c6-bd88-a0a2172bbf71 fb7b54db-7efb-4aaf-b4cb-2364b35b80b3 b27d8e44-ade1-4dd0-b4b5-5eaa7205ff0b 1 0 01.02.2012 0:00 07.12.2012 0:00

The order of the periods of the relevance of the name adresource element, can be determined by viewing the two mixed lists. To do this, every record of the period of relevance of the element contains two pointers to the previous (PrevID) and subsequent periods (NextID). The first period adresource item is missing a pointer to the previous season, and the last (current) a pointer to the next period.



the Fig. 2. Basic field recording adresource element FIAS.

Validity is characterized by the start and end dates of the period, respectively StartDate and EndDate. The start date of the first period and the last are conditional values. For the start date of the first period is taken "01.01.1900 0:00" and end date of the last (current) period is taken "06.06.2079 0:00".

Actual (current) title adresource item specified in the record of the last period, provided that it has not been completed, i.e. the period end date is greater than or equal to the current date.

To simplify the search entry on current period of an element, excluding the start and end date period entered two more fields: CurrStatus and ActStatus.

ActStatus takes quite the expected values are "1" is the current version of the feature, "0" – no current or historical version, as indicated in the directory.

With the values of the fields CurrStatus is more complicated. Using the values solved for two
tasks at the same time: set the identifier of each version of a record about adresource element and assigned to the actuality recordings. Therefore the last relevant item record contains the value "0" in this field, and all historical records are numbered in order of appearance – "1" the earliest record, following it at the time, "2", etc.

Table 2 contains a list of records with versions of descriptions of the streets of the Coastal village Lewinsky Sands. In this list previous to the current record contains a "7" in the field CurrStatus.

the

How it works




the Fig. 3. A simplified implementation of the function fstf_AddressObjects_AddressObjecttree.

For the implementation of the function suggests to use recursive query similar to that shown in Fig. 3, where a_AOGUID ID adresource element a_CurrStatus the actuality for KLADR. Both values are passed to functions through parameters. Like any recursive query, this consists of two parts: the first part is the first entry of the element with the ID a_AOGUID, the following recursive part, are all the current parent record in relation to records obtained by the previous iterations. Navigates to the parent element is the link ParentGUID current record.It is important that each iteration would result in only one record. This imposed restrictions on the characteristic value CurrStatus.
An example of the result of a recursive query in the absence of uniqueness, recording at least one step of iteration is given in Fig. 3.

The question arises – why the restriction on the value CurrStatus implemented using a nested subquery, not by assigning actual values?

First, because the actual record of an item does not always CurrStatus=0 as demonstrated by table 4.

Secondly, it is necessary that the function returned the result to date of the names of the element. In particular, even in the case where the path between the start and end of elements to meet the element that is not declared any relevant records.

the table 3. The function execution result on an item with non-unique basis of choice
the the the the the the
AOGUID CurrStatus ActStatus AOLevel ShortName FormalName
db9c4f8b-b706-40e2-b2b4-d31b98dcd3d1 0 1 1 edge Krasnoyarsk
db9c4f8b-b706-40e2-b2b4-d31b98dcd3d1 0 1 1 edge Krasnoyarsk
3d4c8618-9e22-4838-8f89-80da6851da90 0 1 3 rn Rybinsk
719b789d-2476-430a-89cd-3fedc643d821 51 0 4 g Krasnoyarsk
719b789d-2476-430a-89cd-3fedc643d821 51 1 4 g Krasnoyarsk

If a_CurrStatus =0 to use as a selection condition of the only record for the item, the function will not return for the records of the town of zaozyorny, records of which are given in table 4. At the same time, if you ask a_CurrStatus =51, then the result will be as shown in table 3.

Condition:

the
ao.currstatus = (SELECT MIN(iao.currstatus) FROM WHERE fias_AddressObjects iao ao.aoguid = iao.aoguid)

... which is present in the second part of the recursive query is not entirely correct, assuming the rules for assigning values for the characteristic CurrStatus. Indeed, if among the records adresource the item is not relevant (CurrStatus=0), the most "fresh" is the entry with the maximum, i.e. the last used value of the characteristic CurrStatus.

When using the above condition, the records are out of date item selects the oldest.

the table 4. Element with repeating is not equal to zero value CurrStatus.
the the the
AOGUID CurrStatus ActStatus AOLevel ShortName FormalName
719b789d-2476-430a-89cd-3fedc643d821 51 0 4 g Krasnoyarsk
719b789d-2476-430a-89cd-3fedc643d821 51 1 4 g Krasnoyarsk

Therefore, a better solution would be to use the following conditions:

the
ao.currstatus = CASE WHEN 0 < ALL(SELECT iao.currstatus FROM fias_AddressObjects iao 
WHERE ao.aoguid = iao.aoguid)
THEN (SELECT MAX(iao.currstatus) FROM fias_AddressObjects iao 
WHERE ao.aoguid = iao.aoguid)
ELSE 0 END

But, in this condition two of the subquery. So you have to choose between semantic severity
request and effectiveness of its implementation.

Excuse the use of the first option conditions is the absence of actual differences in the names adresource element in the historical records.

So as of October 13, 2016, 26728 adresovana elements of the Krasnoyarsk territory in 19865, there are historical records. Only for 1350 elements (6,8% of the number of elements that have a history) there are differences in the names of one and the same element. Ie for 93,2% of the elements of the first and second terms will return the same list of names. Differences are possible only in the characteristic values CurrStatus, which, given the purpose of the function, can be neglected.

Completely replacing the sign CurrStatus on ActStatus impossible. The condition ActStatus=1 uniquely selects the current item record, but from historical records it is necessary to use both signs. This solution is given in Appendix Create function fstf_AddressObjects_AddressObjecttree.

the

the APPLICATION


Create function fstf_AddressObjects_AddressObjecttree


DROP FUNCTION IF EXISTS fstf_AddressObjects_AddressObjecttree(a_AOGUID VARCHAR(36), a_CurrStatus INTEGER); /*************************************************************************/ /* Returns a tree (a list of related rows) with the characteristics */ /* adresource element */ /*************************************************************************/ CREATE OR REPLACE FUNCTION fstf_AddressObjects_AddressObjecttree( a_AOGUID VARCHAR(36), /* Global unique identifier */ /* adresource element*/ a_CurrStatus INTEGER default NULL /* Status relevance KLADR 4: */ /* 0 is relevant */ /* 1-50 - historical, i.e. */ /* element was renamed */ /* this entry is given one */ /* of its former names */ /* 51 - perepodchinenie */ ) RETURNS TABLE (rtf_AOGUID VARCHAR(36), rtf_CurrStatus INTEGER, INTEGER rtf_ActStatus, rtf_AOLevel INTEGER,rtf_ShortTypeName VARCHAR(10), rtf_AddressObjectName VARCHAR(100)) AS $BODY$ DECLARE c_ActualStatusCode CONSTANT INTEGER :=1; /* Sign of current record */ /* adresource element */ c_NotActualStatusCode CONSTANT INTEGER :=0; /* code Value of the current record */ v_AOGUID VARCHAR(36); /* ID adresource element */ v_ParentGUID VARCHAR(36); /* parent ID */ v_CurrStatus INTEGER; /* Status of the relevance of KLADR 4*/ v_ActStatus INTEGER; /* Status relevance */ /* adresource element of the FIAS. */ v_AOLevel INTEGER; /*Level adresource element */ v_ShortName VARCHAR(10); /* Short name of element type */ v_FormalName VARCHAR(120); /* Formalized the name of the element */ v_Return_Error INTEGER; /* return Code */ --*********************************************************************** --*********************************************************************** BEGIN IF a_CurrStatus IS NOT NULL THEN SELECT INTO v_AOGUID,v_ParentGUID,v_CurrStatus,v_ActStatus,v_AOLevel, v_ShortName, v_FormalName ao.AOGUID,ao.ParentGUID,ao.CurrStatus,ao.ActStatus,ao.AOLevel, ao.ShortName, ao.FormalName FROM fias_AddressObjects ao WHERE ao.AOGUID=a_AOGUID AND ao.CurrStatus=a_CurrStatus; ELSE SELECT INTO v_AOGUID,v_ParentGUID,v_CurrStatus,v_ActStatus,v_AOLevel, v_ShortName, v_FormalName ao.AOGUID,ao.ParentGUID,ao.CurrStatus,ao.ActStatus,ao.AOLevel, ao.ShortName, ao.FormalName FROM fias_AddressObjects ao WHERE ao.AOGUID=a_AOGUID AND ao.ActStatus=c_ActualStatusCode; IF NOT FOUND THEN SELECT INTO v_AOGUID,v_ParentGUID,v_CurrStatus,v_ActStatus,v_AOLevel, v_ShortName, v_FormalName ao.AOGUID,ao.ParentGUID,ao.CurrStatus,ao.ActStatus,ao.AOLevel, ao.ShortName, ao.FormalName FROM fias_AddressObjects ao WHERE ao.AOGUID=a_AOGUID AND ao.ActStatus=c_NotActualStatusCode AND ao.currstatus = (SELECT MAX(iao.currstatus) FROM fias_AddressObjects iao WHERE ao.aoguid = iao.aoguid); END IF; END IF; RETURN QUERY SELECT v_AOGUID,v_CurrStatus,v_ActStatus,v_AOLevel, v_ShortName,v_FormalName; WHILE v_ParentGUID IS NOT NULL LOOP SELECT INTO v_AOGUID,v_ParentGUID,v_CurrStatus,v_ActStatus,v_AOLevel, v_ShortName, v_FormalName ao.AOGUID,ao.ParentGUID,ao.CurrStatus,ao.ActStatus,ao.AOLevel, ao.ShortName,ao.FormalName FROM fias_AddressObjects ao WHERE ao.AOGUID=v_ParentGUID AND ao.ActStatus=c_ActualStatusCode; IF NOT FOUND THEN SELECT INTO v_AOGUID,v_ParentGUID,v_CurrStatus,v_ActStatus,v_AOLevel, v_ShortName,v_FormalName ao.AOGUID,ao.ParentGUID,ao.CurrStatus,ao.ActStatus,ao.AOLevel, ao.ShortName, ao.FormalName FROM fias_AddressObjects ao WHERE ao.AOGUID=v_ParentGUID AND ao.ActStatus=c_NotActualStatusCode AND ao.currstatus = (SELECT MAX(iao.currstatus) FROM fias_AddressObjects iao WHERE ao.aoguid = iao.aoguid); END IF; RETURN QUERY SELECT v_AOGUID,v_CurrStatus,v_ActStatus,v_AOLevel,v_ShortName, v_FormalName; END LOOP; END; $BODY$ LANGUAGE plpgsql; COMMENT ON FUNCTION fstf_AddressObjects_AddressObjecttree(a_AOGUID VARCHAR(36), a_CurrStatus INTEGER) IS 'Returns the tree (a list of related rows) characteristics adresource element'; --ROLLBACK TRANSACTION; COMMIT TRANSACTION; SELECT * FROM fstf_AddressObjects_AddressObjecttree('719b789d-2476-430a-89cd-3fedc643d821',51) ORDER BY rtf_AOLevel; SELECT * FROM fstf_AddressObjects_AddressObjecttree('719b789d-2476-430a-89cd-3fedc643d821') ORDER BY rtf_AOLevel;
Thank you for your attention!
Article based on information from habrahabr.ru

Комментарии

Популярные сообщения из этого блога

ODBC Firebird, Postgresql, executing queries in Powershell

Installation LivestreetCMS on MODX Revolution package 10 clicks

The Ministry of communications wants to ban phones without GLONASS