Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

varchar types inferred on schema import are set to size 1023. 18 occurrences overflow row size limit #8972

Open
timsehn opened this issue Mar 11, 2025 · 1 comment
Labels

Comments

@timsehn
Copy link
Contributor

timsehn commented Mar 11, 2025

Repro:

  1. Grab file: https://drive.google.com/file/d/14rjb9RwVNTYPdfscjPyY3mvwuxI0TaGA/view?usp=sharing
  2. make a new dolt db
  3. Run
$ dolt table import -c --pk steam_appid steam_games_released cleaned_released_games_feb_2025.csv 

An error occurred while moving data
cause: invalid table spec: expected size < 65504, found 73728
Errors during import can be ignored using '--continue'
  1. Try schema import it works:
$ dolt schema import -c --pks steam_appid steam_games_released cleaned_released_games_feb_2025.csv          
CREATE TABLE `steam_games_released` (
  `name` varchar(1023),
  `steam_appid` int NOT NULL,
  `required_age` int,
  `controller_support` varchar(1023),
  `supported_languages` varchar(1023),
  `developers` varchar(1023),
  `publishers` varchar(1023),
  `platforms` varchar(1023),
  `categories` varchar(1023),
  `genres` varchar(1023),
  `release_date` varchar(1023),
  `followers` int,
  `estimated_wishlists` int,
  `tags` varchar(1023),
  `price` varchar(1023),
  `estimated_revenue` varchar(1023),
  `currency` varchar(1023),
  `owners` varchar(1023),
  `average_forever` int,
  `average_2weeks` int,
  `median_forever` int,
  `median_2weeks` int,
  `concurrent_users` int,
  `total_positive` varchar(1023),
  `total_negative` varchar(1023),
  `total_reviews` varchar(1023),
  `status` varchar(1023),
  PRIMARY KEY (`steam_appid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_bin;
Created table successfully.
  1. Follow with a table import...there are some bad rows:
$ dolt table import -u --pk steam_appid steam_games_released cleaned_released_games_feb_2025.csv

An error occurred while moving data
cause: error: 'undefined' is not a valid value for 'int'
       A bad row was encountered inserting into table steam_games_released (on line 36255):
       	name: Juro Janosik
       	steam_appid: 978630
       	required_age: 0
       	controller_support: full
       	supported_languages: English, Czech, Polish, Spanish - Spain, Italian, French, German, Dutch, Portuguese - Brazil, Greek, Korean, Simplified Chinese, Thai, Traditional Chinese, Turkish, Hungarian, Russian, Ukrainian, Slovak
       	developers: Peter Jurkovsky
       	publishers: Black Deer Games
       	platforms: windows,mac,linux
       	categories: Single-player,Steam Achievements,Full controller support,Steam Cloud,Family Sharing
       	genres: Action,Adventure,Indie,RPG
       	release_date: 1/25/2021
       	followers: 854
       	estimated_wishlists: 10248
       	tags: Top-Down,Isometric,Indie,Adventure,Puzzle,Singleplayer,3D,Funny,Action,Puzzle Platformer,Loot,Casual,Relaxing,Combat,Controller,RPG,Historical,Hack and Slash,3D Platformer,Cute
       	price: 10.99
       	estimated_revenue: 20401
       	currency: USD
       	owners: undefined
       	average_forever: undefined
       	average_2weeks: undefined
       	median_forever: undefined
       	median_2weeks: undefined
       	concurrent_users: undefined
       	total_positive: 74
       	total_negative: 4
       	total_reviews: 78
       	status: released
       
Errors during import can be ignored using '--continue'

Why did the table import make a table too large and schema import did not?

@timsehn timsehn added bug Something isn't working cli import customer issue labels Mar 11, 2025
@timsehn
Copy link
Contributor Author

timsehn commented Apr 1, 2025

I think a good temporary solution here is to infer text columns as varchar(200) instead of varchar(1023).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant