Introduction
When storing and providing access to historical data, a task that arises very often is offloading archived data to a backup medium (for example, magnetic tape), with the ability to quickly restore that information and make it available to users again. The problem is most pressing for data warehouses, although the same approach can also be applied to archived data of OLTP systems.
This article describes a way to solve the problem using the Partitioning option of Oracle Database.
Below is an illustration of the approach, which includes identifying the historical data, moving it into a temporary table, exporting it, and copying it to the backup medium.
Illustration of the approach to relocating historical data
The first step is to identify the partitions that contain historical data. Historical data is data for past periods that will never be modified again. The partitions containing historical data are then moved into a previously prepared temporary table. The next step is to export the metadata for a Transportable Tablespace (TTS). Finally, the metadata file and the tablespace data file are transferred to the backup medium.
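A minimal sketch of the hand-off step, assuming a range-partitioned table DWH.CALLS, a pre-created empty table DWH.CALLS_ARCH of identical structure living in its own tablespace ARCH_TS, and a partition named CALLS_2007_01 (all of these names are illustrative, not from the original text):

```sql
-- Swap the historical partition's segment with the standalone table.
-- EXCHANGE PARTITION is a data-dictionary operation: no rows are copied.
ALTER TABLE dwh.calls
  EXCHANGE PARTITION calls_2007_01 WITH TABLE dwh.calls_arch;

-- A tablespace must be read-only before its metadata can be exported for TTS:
ALTER TABLESPACE arch_ts READ ONLY;
```

The TTS metadata export itself is then done with the export utility, e.g. `exp ... transport_tablespace=y tablespaces=arch_ts`, after which the metadata dump and the ARCH_TS datafile can be copied to tape.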
Below, the tablespace export and import process is examined in detail for a single partition of the partitioned table CALLS (customer phone-call records) in the DWH schema.
The approach described here was adopted as the primary method for relocating and restoring historical data in the corporate data warehouse of OAO Rostelecom.
4.1, EXP/IMP command line help information
Before using the tools, display the built-in help: $ exp help=y (or $ imp help=y).
The language of exp's help output follows the client NLS settings. For example, set nls_lang=simplified chinese_china.zhs16gbk makes the help display in Chinese, while set nls_lang=American_america.<character set> makes it display in English.
All the parameters of EXP (the default values of the parameters are in parentheses):
All parameters of IMP (the default values of the parameters are in parentheses):
A note on the incremental export parameters: the "incremental" mode of exp/imp is not a true incremental backup, so it is best avoided. Parameter usage: exp parameter_name=value or exp parameter_name=(value1,value2). Enter exp help=y to see all the help.
5.3, Reading DMP file info
In many cases someone just hands you a dmp file without telling you which tool exported it, or whether it was exported by table or by user. You do not actually need to ask: the header of the dmp file itself records this information.
A dmp file exported with Oracle's exp tool also contains character set information: the second and third bytes of the file record the character set of the dump. If the dmp file is not large (a few MB or tens of MB), you can open it with UltraEdit in hexadecimal mode, read the contents of the second and third bytes (for example, 0354), and use the following SQL to look up the corresponding character set:
SQL> select nls_charset_name(to_number('0354','xxxx')) from dual;
ZHS16GBK
If the dmp file is very large, say 2 GB or more (which is also the most common situation), a text editor is slow to open it or cannot open it at all. In that case you can use the following command (on a unix host):
cat exp.dmp | od -x | head -1 | awk '{print $2 $3}' | cut -c 3-6
Then use the above SQL to get its corresponding character set.
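Note that od -x groups bytes into 16-bit words in host byte order, so the pipeline above yields the right hex digits only on big-endian unix hosts. A byte-wise, endian-independent variant is sketched below, demonstrated on a tiny stand-in file (for a real dump, substitute its path for /tmp/fake.dmp):

```shell
# Build a 4-byte stand-in whose 2nd and 3rd bytes are 0x03 0x54 (the
# ZHS16GBK charset id), mimicking the relevant part of a dmp header.
printf '\x01\x03\x54\x02' > /tmp/fake.dmp
# -j1 skips the first byte, -N2 reads two bytes, -tx1 prints one hex pair
# per byte, so the output does not depend on host endianness.
od -An -tx1 -j1 -N2 /tmp/fake.dmp | tr -d ' \n'
# prints: 0354
```

The resulting four hex digits feed straight into the nls_charset_name query shown above.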
The rest of the header holds the exp version information and the export mode. A user-mode export contains an entry such as DCHENQY.RUSERS; a table-mode export contains an entry such as DCHENQY.RTABLES.
As shown in the figure: a table-mode export.
Modify the character set of the dmp file
As mentioned above, the 2nd and 3rd bytes of the dmp file record the character set, so by directly editing those two bytes you can "cheat" Oracle's character-set check. In theory only a subset-to-superset change is safe, but in practice conversions outside the subset/superset relationship often work as well. The commonly used character sets US7ASCII, WE8ISO8859P1, ZHS16CGB231280 and ZHS16GBK can basically all be interchanged this way. Because only the dmp file header is changed, the impact is small.
There are many ways to make the change; the easiest is to edit the second and third bytes of the dmp file directly with UltraEdit.
For example, to change the character set of the dmp file to ZHS16GBK, first use the following SQL to find the hexadecimal code of that character set:
SQL> select to_char(nls_charset_id('ZHS16GBK'), 'xxxx') from dual;
0354
Then change the 2nd and 3rd bytes of the dmp file to 03 54.
If the dmp file is too large to open in UltraEdit, you have to patch the bytes programmatically.
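One way to do that is with standard unix tools, sketched here on a tiny stand-in file (for a real dump, substitute its path; the target bytes 03 54 are the ZHS16GBK id found above):

```shell
# Stand-in for a huge dmp: 4 bytes, with placeholder bytes at positions 2-3.
printf '\x01\x00\x00\x02' > /tmp/big.dmp
# Overwrite exactly 2 bytes at offset 1 (i.e. the 2nd and 3rd bytes);
# conv=notrunc keeps the rest of the file intact instead of truncating it.
printf '\x03\x54' | dd of=/tmp/big.dmp bs=1 seek=1 conv=notrunc 2>/dev/null
od -An -tx1 /tmp/big.dmp | tr -d ' \n'
# prints: 01035402
```

Unlike a text editor, dd touches only the two target bytes, so this works regardless of file size.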
4.4, different versions of EXP/IMP issues
Generally speaking, importing from a lower version into a higher version causes few problems; the trouble starts when data from a higher version must be imported into a lower one. Before Oracle9i, cross-version EXP/IMP could be handled by the following methods:
In 9i, however, the above approach no longer works. If you simply use the lower version's EXP/IMP, the following error occurs:
This is a published bug that is only resolved in Oracle 10.0; the bug number is 2261722, and you can look up the details on METALINK.
A bug is a bug, but the work still has to get done. Until Oracle ships a fix, we can work around it ourselves: execute the following SQL in Oracle9i to rebuild the exu81rls view.
EXP/IMP can be used across versions, as long as the right versions of the tools are chosen:
1. Always use the IMP that matches the target database version. For example, to import into 817, use the 817 IMP tool.
2. Always use the EXP that matches the lower of the two database versions. For example, when exporting from 9201 to import into 817, use the 817 EXP tool.
Identifying historical data
To identify historical data, that is, data that will no longer change, the administrator should check for its appearance monthly. Which data qualifies as historical is determined by business requirements. Often the rule reduces to a simple condition: data is considered historical once its retention age exceeds a given limit, for example five years from the current moment.
To automate the identification of historical data in a particular fact table, the following kind of query against the Oracle Database data dictionary can be executed:
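The original query is not reproduced in this text; one possible sketch follows, under the assumption that the fact table's partitions follow a monthly naming convention such as CALLS_2007_01 (the naming convention, schema and table names are illustrative):

```sql
-- List partitions of the fact table whose name-encoded period (YYYY_MM
-- suffix) is more than 5 years old; these are candidates for archiving.
SELECT table_name, partition_name
  FROM dba_tab_partitions
 WHERE table_owner = 'DWH'
   AND table_name  = 'CALLS'
   AND TO_DATE(REPLACE(SUBSTR(partition_name, -7), '_', ''), 'YYYYMM')
       < ADD_MONTHS(SYSDATE, -12 * 5)
 ORDER BY partition_position;
```

An alternative is to parse the HIGH_VALUE column of DBA_TAB_PARTITIONS, but since it is a LONG column that requires PL/SQL, the name-based check is the simpler sketch.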
The query returns the list of partitions (see the PARTITION_NAME column) of the tables whose data is historical (retained for more than five years). This data should be archived and moved to the backup medium.
5.2, imp optimization
Reference document: Tuning Considerations When Import Is Slow
Refer to the following ways to optimize imp:
When the amount of data handled by exp/imp is large, the process takes a long time, but several techniques can speed it up. exp: use the direct path, direct=y. Oracle then bypasses the SQL statement-processing engine, reads data straight from the database files, and writes it to the export file. Whether a table fell back to the conventional path can be observed in the export log, e.g.: exp-00067: table xxx will be exported in conventional path.
If the direct path is not used, the BUFFER parameter must be set large enough. Some parameters are incompatible with direct=y: direct path cannot be used for transportable tablespace exports, nor together with the QUERY parameter for exporting a database subset.
When the exporting and importing databases run under different operating systems, the RECORDLENGTH parameter must be set to the same value on both sides.
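Putting these points together, a hedged example of a direct-path export (the user, file names and table list are illustrative; recordlength=65535 is the documented maximum):

```shell
exp userid=dwh/dwh file=calls.dmp log=calls.log tables=calls \
    direct=y recordlength=65535 feedback=100000
```

Note that BUFFER is omitted here because it is ignored on the direct path.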
4.3, EXP/IMP and character set
When importing and exporting data, we should pay attention to character sets. In the EXP/IMP process there are four character-set settings to watch: the export client's character set, the export database's character set, the import client's character set, and the import database's character set. Check all four first. To view the database character set:
Let’s check the character set information of the client:
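The queries themselves are not shown above; commonly used forms for both checks are (a sketch):

```sql
-- Database character set:
SELECT value FROM nls_database_parameters
 WHERE parameter = 'NLS_CHARACTERSET';
-- Current session (client) NLS settings:
SELECT userenv('LANGUAGE') FROM dual;
```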
In Windows, NLS_LANG can be queried and modified in the registry:
Here xx refers to the home number when there are multiple ORACLE_HOMEs installed.
In unix:
To modify it in unix:
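For example, in the shell session that will run exp/imp (the exact character set should match your database; AMERICAN_AMERICA.ZHS16GBK here is illustrative):

```shell
# Set the client character set for this shell session before running exp/imp.
export NLS_LANG=AMERICAN_AMERICA.ZHS16GBK
echo "$NLS_LANG"
# prints: AMERICAN_AMERICA.ZHS16GBK
```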
It is usually best to set the client character set to be the same as that of the database when exporting. When importing data, there are mainly the following two situations:
2.6, impdp optimization
1. How can index creation during impdp be sped up?
In What Order Are Indexes Built During Datapump Import (IMPDP) and How to Optimize the Index Creation (Doc ID 1966442.1)
4. Detailed explanation of EXP/IMP usage
Starting with Oracle 10g, expdp/impdp has been strongly promoted as the replacement for exp/imp, although even the latest 19c release still ships exp/imp for compatibility. exp/imp is not without advantages: it remains very handy for quickly exporting a small table, if only because no directory object has to be set up.
Import/Export are two of the oldest ORACLE command-line tools still around. Frankly, I have never considered Exp/Imp a good backup method; it is better described as a good dump tool. It has earned its keep dumping small databases, migrating tablespaces, extracting tables, and detecting logical and physical corruption. For a small database it can also serve as a logical auxiliary backup alongside the physical backup, which is a fine suggestion. But as databases grow larger, especially with the emergence of TB-scale databases and ever more data warehouses, EXP/IMP becomes more and more inadequate, and database backups have shifted to RMAN and third-party tools. The following explains the use of EXP/IMP.
2.5, remote export and import
2.5.1, use network_link
In Oracle 10g and later, expdp/impdp can use a database link between two databases so that the impdp command on the target side pulls the source data directly (network_link).
Usage 1: Use the expdp command to put the remote dump file into the local directory
Executed locally; the user name and password are those of the local database.
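A hedged sketch of usage 1, assuming a local directory object DUMP_DIR and a database link SRC_LINK already pointing at the source database (user, link and file names are illustrative):

```shell
expdp scott/tiger directory=DUMP_DIR network_link=SRC_LINK \
      schemas=scott dumpfile=scott_remote.dmp logfile=scott_remote.log
```

The dump file is written on the local side even though the rows come across the link.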
Usage 2: Use the impdp command on the target side to directly import the source data
1. Create a dblink from the target end to the source end
The prerequisite is that tnsping succeeds, which requires tnsnames.ora to be configured.
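A sketch of the link creation (user, password and TNS alias are illustrative):

```sql
CREATE DATABASE LINK src_link
  CONNECT TO scott IDENTIFIED BY tiger
  USING 'SRC_TNS';
-- quick sanity check that the link works:
SELECT 1 FROM dual@src_link;
```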
2. Use the Impdp command on the target side to import the source data
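A hedged sketch of the direct import: no dump file is created, and DUMP_DIR here is needed only for the log file (all names are illustrative):

```shell
impdp system/oracle network_link=SRC_LINK schemas=scott \
      remap_schema=scott:scott_copy directory=DUMP_DIR logfile=scott_net.log
```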
2.5.2, use tnsname to connect to the database for export and import
Scenario 1: when one machine hosts several database instances (common in development environments), be careful which instance you are connected to. Either set the SID first, e.g. export ORACLE_SID=xxx, or add @name to the export and import statements.
Scenario 2: you cannot execute commands locally on the database server; you have to connect to the database with a client and run the export and import commands from there.
3. Matters needing attention. The user running impdp on the target side needs the privilege to use the dblink. The connect user in the dblink needs the privilege to export the source objects. Remote direct impdp does not support parallel, nor certain data types; see the official Oracle documentation for details.
2.2. Filter objects
2.2.1, use include
How to export specified packages, package bodies, procedures, etc.: see the official documentation. In the syntax, include= takes an object_type, not just tables.
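A parfile sketch (run as expdp ... parfile=code.par); the directory, file and procedure names are illustrative. In a parameter file the name clause needs no shell escaping:

```
directory=DUMP_DIR
dumpfile=code.dmp
include=PACKAGE
include=PROCEDURE:"IN ('CALC_BONUS')"
```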
2.2.2, use the query option
Handling single and double quotes in the QUERY parameter: shell escaping rules bite when the expdp QUERY option itself contains single or double quotes.
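The simplest way to sidestep the escaping problem is a parfile, where the quotes can be written literally (table name and predicate are illustrative):

```
directory=DUMP_DIR
dumpfile=calls_subset.dmp
tables=calls
query=calls:"WHERE call_date < DATE '2015-01-01'"
```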
2.2.3, use EXCLUDE to exclude objects
For example: when exporting the DDL statements of the whole database, use EXCLUDE to drop the DDL of system users that are not needed on import.
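A sketch of such an expdp parfile (the excluded user list is illustrative; content=metadata_only keeps only the DDL):

```
directory=DUMP_DIR
dumpfile=full_ddl.dmp
full=y
content=metadata_only
exclude=SCHEMA:"IN ('SYS','SYSTEM','OUTLN','XDB')"
```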
2.2.4. Use content to filter and export metadata or only data
If you only need the data of a single partition or sub-partition of a table, export just the rows with content=data_only.
If you only need to export the table structure, content=metadata_only
An impdp trick: add the sqlfile parameter when importing a table to have the DDL statements contained in the dump written to a SQL script instead of being executed.
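A sketch of the DDL-extraction variant (user, directory and file names are illustrative); nothing is actually imported, the statements only land in the script:

```shell
impdp scott/tiger directory=DUMP_DIR dumpfile=calls.dmp sqlfile=calls_ddl.sql
```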
4.2, EXP/IMP common options
1. FULL: exports the entire database; combined with ROWS=N it exports just the structure of the whole database. E.g.: exp userid=test/test file=./db_str.dmp log=./db_str.log full=y rows=n compress=y direct=y
2. OWNER and TABLES: these two options define the objects to export. OWNER exports the specified user's objects; TABLES names the specific tables to export. For example:
exp userid=test/test file=./db_str.dmp log=./db_str.log owner=duanl
exp userid=test/test file=./db_str.dmp log=./db_str.log tables=nc_data,fi_arap
3. BUFFER and FEEDBACK: when exporting a large amount of data, consider setting both. E.g.: exp userid=test/test file=yw97_2003.dmp log=yw97_2003_3.log feedback=10000 buffer=100000000 tables=WO4,OK_YT
4. FILE and LOG: these two parameters specify the dump file name and the log file name respectively, including file name and directory; see the examples above.
5. COMPRESS does not compress the contents of the exported data; it controls how the storage clause of exported objects is generated. The default is Y, in which case the INITIAL extent in an object's storage clause is set to the total size of the object's current extents. COMPRESS=N is recommended.
6. FILESIZE: available since 8i. If the exported dmp file would be too large, use FILESIZE to cap each file, e.g. at no more than 2G: exp userid=duanl/duanl file=f1,f2,f3,f4,f5 filesize=2G owner=scott This creates a series of files f1.dmp, f2.dmp, ..., each 2G in size; if the total export is under 10G, EXP need not create f5.dmp.
Commonly used IMP options
1. FROMUSER and TOUSER: use them to import data from one schema into another. For example, suppose we exported a user's objects with exp and now want to import them for another user: imp userid=test1/test1 file=expdat.dmp fromuser=test1 touser=test1
2. IGNORE, GRANTS and INDEXES. IGNORE=Y makes imp ignore "table already exists" errors and keep importing, which is useful when you want to adjust a table's storage parameters: pre-create the table with sensible storage settings for the actual data, then import the data straight into it. GRANTS and INDEXES control whether grants and indexes are imported; to rebuild indexes with new storage parameters, or simply to speed up the import, consider setting INDEXES=N, while GRANTS is generally left at Y. For example: imp userid=test1/test1 file=expdat.dmp fromuser=test1 touser=test1 indexes=N
2.0, usage tips
1. Use parfile parameter file
To connect as sysdba, put userid='/ as sysdba' in the parameter file.
When Chinese characters are needed, put the Chinese object names into the parameter file rather than typing them on the command line; refer to ID 7154316.8.
2. Dumpfile variable %U
Suggested naming rules: dbname_business_expdp&impdp_%U.dmp.date
3. Remove storage attributes during impdp, so that the INITIAL extent in the generated DDL does not bloat the tablespace:
transform=segment_attributes:n
4. To speed up the export, add parallel and exclude indexes, grants and statistics:
EXCLUDE=STATISTICS,INDEX,GRANTS
5. Compress dmp files, compression=ALL
6. Specify parallel, parallel = 4
7. Forgot to enable parallel, or to limit the dump file size, etc.?
After attaching to the running expdp job, enter commands in the interactive command-line mode:
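A sketch of re-attaching to a running job (the job name is illustrative; the actual name can be looked up in dba_datapump_jobs):

```shell
expdp system/oracle attach=SYS_EXPORT_SCHEMA_01
# At the Export> prompt, for example:
#   Export> parallel=4
#   Export> status
#   Export> continue_client
```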