MySQL Shell's parallel table import utility
About the Utility

MySQL Shell's parallel table import utility supports the output from MySQL Shell's table export utility, which can compress the data file it produces as output, and can export it to a local folder or an Object Storage bucket. The default dialect for the parallel table import utility matches the default output format of the table export utility. The parallel table import utility can also be used to upload files from other sources. The data file or files to be imported can be in any of the following locations:

- A location that is accessible to the client host as a local disk.
- An Oracle Cloud Infrastructure Object Storage bucket.
- An S3-compatible bucket, such as Amazon Web Services (AWS) S3.
The data is imported to a single relational table in the MySQL server to which the active MySQL session is connected. When you run the parallel table import utility, you specify the mapping between the fields in the data file or files and the columns in the MySQL table. You can set field- and line-handling options as for the LOAD DATA statement. A number of functions have been added to the parallel table import utility since it was introduced, so use the most recent version of MySQL Shell to get the utility's full functionality.

Input preprocessing

From MySQL Shell 8.0.22, the parallel table import utility can capture columns from the data file or files for input preprocessing, in the same way as with a LOAD DATA statement.

Import locations

Up to MySQL Shell 8.0.20, the data must be imported from a location that is accessible to the client host as a local disk. From MySQL Shell 8.0.21, the data can also be imported from an Oracle Cloud Infrastructure Object Storage bucket, specified by the osBucketName option.

Multiple data files

Up to MySQL Shell 8.0.22, the parallel table import utility can import a single input data file to a single relational table. From MySQL Shell 8.0.23, the utility is also capable of importing a specified list of files, and it supports wildcard pattern matching to include all relevant files from a location. Multiple files uploaded by a single run of the utility are placed into a single relational table, so, for example, data that has been exported from multiple hosts could be merged into a single table to be used for analytics.

Compressed file handling

Up to MySQL Shell 8.0.21, the parallel table import utility only accepts an uncompressed input data file. The utility analyzes the data file, distributes it into chunks, and uploads the chunks to the relational table in the target MySQL server, dividing the chunks up between the parallel connections. From MySQL Shell 8.0.22, the utility can also accept data files compressed in the gzip (.gz) or zstd (.zst) format, which it detects from the file extension. (To reload dump files produced by MySQL Shell's dump utilities, use MySQL Shell's dump loading utility instead.)

Requirements and Restrictions

The parallel table import utility uses LOAD DATA LOCAL INFILE statements to upload data, so the local_infile system variable must be set to ON on the target MySQL server.
To avoid a known potential security issue with LOAD DATA LOCAL, when the MySQL server replies to the parallel table import utility's LOAD DATA requests with file transfer requests, the utility only sends the predetermined data chunks, and ignores any specific requests attempted by the server.

Running the Utility

The parallel table import utility requires an existing classic MySQL protocol connection to the target MySQL server. Each thread opens its own session to send chunks of the data to the MySQL server, or in the case of compressed files, to send multiple files in parallel. You can adjust the number of threads, the number of bytes sent in each chunk, and the maximum rate of data transfer per thread, to balance the load on the network and the speed of data transfer. The utility cannot operate over X Protocol connections, which do not support LOAD DATA statements. In the MySQL Shell API, the parallel table import utility is a function of the util global object: util.importTable() in JavaScript mode, or util.import_table() in Python mode.
The function returns void, or an exception in case of an error. If the import is stopped partway, whether by the user with Ctrl+C or by an error, the utility stops sending data. When the server finishes processing the data it received, messages are returned showing the chunk that was being imported by each thread at the time, the percentage complete, and the number of records that were updated in the target table. The following examples, the first in MySQL Shell's JavaScript mode and the second in MySQL Shell's Python mode, import the data in a single CSV file into a table in the connected MySQL server:
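A sketch of such calls, assuming a hypothetical file /tmp/productrange.csv and a target table mydb.products (these names are placeholders, not from the original):

```
mysql-js> util.importTable("/tmp/productrange.csv", {
              schema: "mydb", table: "products",
              dialect: "csv-unix", skipRows: 1, showProgress: true})

mysql-py> util.import_table("/tmp/productrange.csv", {
              "schema": "mydb", "table": "products",
              "dialect": "csv-unix", "skipRows": 1, "showProgress": True})
```

Here skipRows: 1 omits a header line from the upload, and the csv-unix dialect selects field- and line-handling defaults for Unix-style CSV files.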
The following example in MySQL Shell's Python mode only specifies the dialect for the CSV file.
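A minimal sketch of such a call (the file path is a placeholder):

```
mysql-py> util.import_table("/tmp/productrange.csv", {"dialect": "csv"})
```

Because no schema or table option is given, the utility uses the schema of the current session and derives the table name from the file name (here, productrange).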
The following example in MySQL Shell's Python mode imports the data from multiple files, including a mix of individually named files, ranges of files specified using wildcard pattern matching, and compressed files:
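A sketch of such a call (all file, schema, and table names are placeholders):

```
mysql-py> util.import_table(
              ["data_a.csv", "data_b*.csv", "data_c.csv.zst"],
              {"schema": "mydb", "table": "productrange"})
```

The wildcard pattern data_b*.csv is expanded to all matching files, and the compressed file data_c.csv.zst is detected by its extension; all of the matched files are loaded into the single table mydb.productrange.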
The parallel table import utility can also be invoked from the command line using the mysqlsh command interface. With this interface, you invoke the utility as in the following examples:
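For example, a command-line invocation might look like this (the connection details, file path, and schema and table names are placeholders):

```
mysqlsh mysql://root@127.0.0.1:3306/mydb -- util import-table /tmp/productrange.csv --schema=mydb --table=products --bytesPerChunk=10M
```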
When you import multiple data files, ranges of files specified using wildcard pattern matching are expanded by MySQL Shell's glob pattern matching logic if they are quoted, as in the following example. Otherwise, they are expanded by the pattern matching logic of the user shell in which you entered the mysqlsh command.
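A sketch of a quoted wildcard (placeholder connection details and file names):

```
mysqlsh mysql://root@127.0.0.1:3306/mydb -- util import-table "data_part_*.csv" --table=productrange
```

Because the wildcard is quoted, MySQL Shell, rather than the operating system shell, expands data_part_*.csv.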
When you use the mysqlsh command's API command line integration to invoke the parallel table import utility directly (the dash-dash-space sequence "-- " followed by util import-table), the utility's options are supplied as command-line arguments, as in the examples above.

Options for Importing Tables

The following import options are available for the parallel table import utility to specify how the data is imported:

schema: "db_name"
The name of the target database on the connected MySQL server. If you omit this option, the utility attempts to identify and use the schema name in use for the current MySQL Shell session, as specified in a connection URI string or a \use command. If the schema name is not specified and cannot be identified from the current session, an error is returned.

table: "table_name"
The name of the target relational table. If you omit this option, the utility assumes the table name is the name of the data file without the extension. The target table must exist in the target database.

columns: array
An array of strings containing column names from the import file or files, given in the order that they map to columns in the target relational table. Use this option if the imported data does not contain all the columns of the target table, or if the order of the fields in the imported data differs from the order of the columns in the table. If you omit this option, input lines are expected to contain a matching field for each column in the target table. From MySQL Shell 8.0.22, you can use this option to capture columns from the import file or files for input preprocessing, in the same way as with a LOAD DATA statement. In this example in MySQL Shell's JavaScript mode, the second and fourth columns from the import file are assigned to user variables:

decodeColumns:
A dictionary of key-value pairs that assigns import file columns, captured as user variables by the columns option, to columns in the target table. The key is a target column name, and the value is an SQL expression over the captured user variables, as in the SET clause of a LOAD DATA statement. In this example in MySQL Shell's JavaScript mode, the first input column from the data file is used as the first column in the target table. The second input column, which has been assigned to a user variable by the columns option, is transformed by an SQL expression before the result is used to populate the second column of the target table.
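A sketch of such a call, with placeholder file, table, and column names (the capture index and the expression are illustrative assumptions, not from the original):

```
mysql-js> util.importTable("/tmp/data.csv", {
              schema: "mydb",
              table: "targets",
              columns: ["column1", 1],                // 2nd input field captured as @1
              decodeColumns: {"column2": "@1 / 100"}  // transform before loading
          })
```

Each key in decodeColumns names a target column, and each value is an SQL expression over the captured variables, evaluated on the server as in the SET clause of a LOAD DATA statement.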
In this example in MySQL Shell's JavaScript mode, the input columns from the data file are both assigned to variables, then transformed in various ways and used to populate the columns of the target table.

skipRows: number
Skip this number of rows of data at the beginning of the import file, or in the case of multiple import files, at the beginning of every file included in the file list. You can use this option to omit an initial header line containing column names from the upload to the table. The default is that no rows are skipped.

replaceDuplicates: [true|false]
Whether input rows that have the same value for a primary key or unique index as an existing row should be replaced (true) or skipped (false). The default is false.

dialect: [default|csv|csv-unix|tsv|json]
Use a set of field- and line-handling options appropriate for the specified file format. You can use the selected dialect as a base for further customization, by also specifying one or more of the linesTerminatedBy, fieldsTerminatedBy, fieldsEnclosedBy, fieldsOptionallyEnclosed, and fieldsEscapedBy options. The settings applied by each dialect are shown in the following table.

Table 11.2 Dialect settings for parallel table import utility
Note
linesTerminatedBy: "characters"
One or more characters (or an empty string) that terminate each of the lines in the input data file or files. The default is as for the specified dialect, or a linefeed character (\n) if the dialect option is omitted. This option is equivalent to the LINES TERMINATED BY clause of a LOAD DATA statement.

fieldsTerminatedBy: "characters"
One or more characters (or an empty string) that terminate each of the fields in the input data file or files. The default is as for the specified dialect, or a tab character (\t) if the dialect option is omitted. This option is equivalent to the FIELDS TERMINATED BY clause of a LOAD DATA statement.

fieldsEnclosedBy: "character"
A single character (or an empty string) that encloses each of the fields in the input data file or files. The default is as for the specified dialect, or the empty string if the dialect option is omitted. This option is equivalent to the FIELDS ENCLOSED BY clause of a LOAD DATA statement.

fieldsOptionallyEnclosed: [ true | false ]
Whether the character given for fieldsEnclosedBy encloses all of the fields in the input data file or files (false), or encloses the fields only in some cases (true). The default is as for the specified dialect, or false if the dialect option is omitted. This option makes fieldsEnclosedBy equivalent to the FIELDS OPTIONALLY ENCLOSED BY clause of a LOAD DATA statement.

fieldsEscapedBy: "character"
The character that begins escape sequences in the input data file or files. If this is not provided, escape sequence interpretation does not occur. The default is as for the specified dialect, or a backslash (\) if the dialect option is omitted. This option is equivalent to the FIELDS ESCAPED BY clause of a LOAD DATA statement.

characterSet: "charset"
Added in MySQL Shell 8.0.21. This option specifies a character set encoding with which the input data is interpreted during the import. Setting the option to binary means that no conversion is done during the import. When you omit this option, the import uses the character set specified by the character_set_database system variable to interpret the input data.

bytesPerChunk: "size"
For a list of multiple input data files, this option is not available. For a single input data file, this option specifies the number of bytes (plus any additional bytes required to reach the end of the row) that threads send for each LOAD DATA call to the target server. The chunk size can be specified as a number of bytes, or using the suffixes k (kilobytes), M (megabytes), and G (gigabytes). The default is 50M.

threads: number
The maximum number of parallel threads to use to send the data in the input file or files to the target server. If you do not specify a number of threads, the default maximum is 8. For a list of multiple input data files, the utility creates the specified or maximum number of threads. For a single input data file, the utility calculates an appropriate number of threads to create up to this maximum, using the following formula:
min(max(1, TotalFileSize / bytesPerChunk), threads)

where TotalFileSize is the size of the input data file, bytesPerChunk is the chunk size described above, and threads is the specified maximum number of threads. Compressed files cannot be distributed into chunks, so instead the utility uses its parallel connections to upload multiple files at a time. If there is only one input data file, the upload of a compressed file can only use a single connection.

maxRate:
" The maximum limit on data throughput in bytes per second per thread. Use this option if you need to
avoid saturating the network or the I/O or CPU for the client host or target server. The maximum rate can be specified as a number of bytes, or using the suffixes k (kilobytes), M (megabytes), G (gigabytes). For example, showProgress: [ true | false ] Display ( sessionInitSql: A list of SQL statements to run at the start of each client session used for loading data into the target MySQL instance. You can use this option to change session variables. This option is available from MySQL Shell 8.0.30. For example, the following statements skip binary logging on the target MySQL instance for the sessions used by the utility during the course of the import, and increase the number of threads available for index creation:
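A sketch of such a call in MySQL Shell's Python mode (the file, schema, and table names are placeholders, and the exact statements are illustrative assumptions):

```
mysql-py> util.import_table("/tmp/productrange.csv", {
              "schema": "mydb", "table": "products",
              "sessionInitSql": [
                  "SET SESSION sql_log_bin = 0",        # skip binary logging
                  "SET SESSION innodb_ddl_threads = 8"  # more index-build threads
              ]})
```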
If an error occurs while running the SQL statements, the import stops and returns an error message.

Options for Oracle Cloud Infrastructure

MySQL Shell supports importing input data files stored in Oracle Cloud Infrastructure Object Storage buckets.

osBucketName: "name"
Added in MySQL Shell 8.0.21. The name of the Oracle Cloud Infrastructure Object Storage bucket where the input data file is located. By default, the [DEFAULT] profile in the Oracle Cloud Infrastructure CLI configuration file located at ~/.oci/config is used to establish a connection to the bucket. You can substitute an alternative profile and configuration file with the ociProfile and ociConfigFile options.

osNamespace: "namespace"
Added in MySQL Shell 8.0.21. The Oracle Cloud Infrastructure namespace where the Object Storage bucket named by osBucketName is located. The namespace for an Object Storage bucket is displayed in the Bucket Information tab of the bucket details page in the Oracle Cloud Infrastructure console.

ociConfigFile: "file"
Added in MySQL Shell 8.0.21. An Oracle Cloud Infrastructure CLI configuration file that contains the profile to use for the connection, instead of the one in the default location ~/.oci/config.

ociProfile: "profile"
Added in MySQL Shell 8.0.21. The profile name of the Oracle Cloud Infrastructure profile to use for the connection, instead of the [DEFAULT] profile.

Options for S3-Compatible Services

MySQL Shell supports importing input data files stored in S3-compatible buckets, such as Amazon Web Services (AWS) S3. For information on the supported services and their configuration requirements, see Section 4.7, “Cloud Service Configuration”.

s3BucketName: "name"
Added in MySQL Shell 8.0.30. The name of the S3 bucket where the dump files are located. By default, the default profile in the AWS CLI config and credentials files located at ~/.aws/ is used to establish a connection to the S3 bucket. You can substitute alternative configurations and credentials with the s3ConfigFile and s3CredentialsFile options.

s3CredentialsFile:
" Added in MySQL Shell 8.0.30. A credentials
file that contains the user's credentials to use for the connection, instead of the one in the default location, s3ConfigFile:
" Added in MySQL Shell 8.0.30. An AWS CLI configuration file that contains the profile to use for the connection, instead of the one in the default location s3Profile:
" Added in MySQL Shell 8.0.30. The profile name of the s3 CLI profile to use for the connection, instead of the s3EndpointOverride:
" The URL of the endpoint to use instead of the default. Added in MySQL Shell 8.0.30. When connecting to the Oracle Cloud Infrastructure S3 compatbility
API, the endpoint takes the following format: For a namespace named axaxnpcrorw5 in the US East (Ashburn) region: How do I import a dataset into MySQL Workbench?To import a file, open Workbench and click on + next to the MySQL connections option. Fill in the fields with the connection information. Once connected to the database go to Data Import/Restore. Choose the option Import from Self-Contained File and select the file.
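The endpoint format above can be sketched as a small helper (the function name is illustrative, not part of MySQL Shell):

```python
def oci_s3_compat_endpoint(namespace: str, region: str) -> str:
    """Build an OCI Object Storage S3-compatibility endpoint URL."""
    return f"https://{namespace}.compat.objectstorage.{region}.oraclecloud.com"

# The namespace and region from the example above:
print(oci_s3_compat_endpoint("axaxnpcrorw5", "us-ashburn-1"))
# → https://axaxnpcrorw5.compat.objectstorage.us-ashburn-1.oraclecloud.com
```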