添加链接
link之家
链接快照平台
  • 输入网页链接,自动生成快照
  • 标签化管理网页链接
Collectives™ on Stack Overflow

Find centralized, trusted content and collaborate around the technologies you use most.

Learn more about Collectives

Teams

Q&A for work

Connect and share knowledge within a single location that is structured and easy to search.

Learn more about Teams

What is the proper indexing for this query.

I tried given different combinations of indexes for this query but it is still using from using tempory , using filesort etc.

Total table data - 7,60,346

product = 'Dresses' - Total rows = 122 554

CREATE TABLE IF NOT EXISTS `product_data` (
  `table_id` int(11) NOT NULL AUTO_INCREMENT,
  `id` int(11) NOT NULL,
  `price` int(11) NOT NULL,
  `store` varchar(255) NOT NULL,
  `brand` varchar(255) DEFAULT NULL,
  `product` varchar(255) NOT NULL,
  `model` varchar(255) NOT NULL,
  `size` varchar(50) NOT NULL,
  `discount` varchar(255) NOT NULL,
  `gender_id` int(11) NOT NULL,
  `availability` int(11) NOT NULL,
  PRIMARY KEY (`table_id`),
  UNIQUE KEY `table_id` (`table_id`),
  KEY `id` (`id`),
  KEY `discount` (`discount`),
  KEY `step_one` (`product`,`availability`),
  KEY `step_two` (`product`,`availability`,`brand`,`store`),
  KEY `step_three` (`product`,`availability`,`brand`,`store`,`id`),
  KEY `step_four` (`brand`,`store`),
  KEY `step_five` (`brand`,`store`,`id`)
) ENGINE=InnoDB ;

Query :

SELECT id ,store,brand FROM `product_data` WHERE product='dresses' and 
availability='1' group by brand,store order by store limit 10;
  

excu..time :- (10 total, Query took 1.0941 sec)

EXPLAIN PLAN :

possible_keys :- step_one, step_two, step_three, step_four, step_five

key :- step_two

ref :- const,const

rows :- 229438

Extra :-Using where; Using temporary; Using filesort

I tried these indexes

Key step_one (product,availability)

Key step_two (product,availability,brand,store)

Key step_three (product,availability,brand,store,id)

Key step_four (brand,store)

Key step_five (brand,store,id)

@kuldeepupadhyay please give us the time results for each of the indexes combination , you've mentionned – Andy K Sep 29, 2017 at 10:21 they pick automic step-two i think this is best combination , but still time more and using tempory ,using filesort – kuldeep upadhyay Sep 29, 2017 at 10:27

The real problem is not the index, but the mismatch between GROUP BY and ORDER BY preventing taking advantage of LIMIT.

INDEX(product, availability, store, brand, id)

will be "covering" and in the right order. But note that I have swapped store and brand...

Change the query to

SELECT  id ,store,brand
    FROM  `product_data`
    WHERE  product='dresses'
      and  availability='1'
    GROUP BY store, brand    -- change
    ORDER BY store, brand    -- change
    limit  10;

That changes the GROUP BY to start with store, to reflect the ORDER BY ordering -- this avoid an extra sort. And it changes the ORDER BY to be identical to the GROUP BY so that the two can be combined.

Given those changes, the INDEX can now go all the way through to the LIMIT, thereby allowing the processing to look at only 10 rows, not a much larger set.

Anything less than all these changes will not be as efficient.

Further discussion:

INDEX(product, availability,   -- these two can be in either order
      store, brand,      -- must match both `GROUP BY` and `ORDER BY`
      id)   -- tacked on (on the end) to make it "covering"

"Covering" means that all the columns for the SELECT are found in the INDEX, so no need to reach over into the data.

But... The whole query does not make sense because of the inclusion of id in the SELECT. If you want to find what stores have available dresses, then get rid of id. If you want to list all the available dresses, then change id to GROUP_CONCAT(id).

For the indexes, the best index is the step_two. The product field is used in where and has more variation than the availability-field.

Couple of notes about the query:

  • availability='1' should be availability=1 so that needless int->varchar conversion would be avoided.
  • "group by brand" should not be used as GROUP BY should only be used when you use aggregate functions as selected columns. What as it that you were trying to achieve with the group by?
  • Your group by clause doesn't really make sense without an aggregate function.

    If you can re-write the query to

    SELECT id ,store 
    FROM `product_data` 
    WHERE product='dresses' 
    and availability='1' 
    order by store limit 10;
    

    Then an index on (product,availability,store) will remove all filesorts.

    See SQLFiddle: http://sqlfiddle.com/#!9/60f33d/2

    UPDATE:

    The SQLFiddle makes your intention clear - you're using GROUP BY to simulate DISTINCT. I don't think you can get rid of the filesort and temporary table steps in your query if this is the case - but I also don't think those steps should be hugely expensive.

    GROUP BY has the effect of DISTINCT in the absence of aggregates. But DISTINCT will stumble over the unique id. – Rick James Sep 29, 2017 at 18:18

    Thanks for contributing an answer to Stack Overflow!

    • Please be sure to answer the question. Provide details and share your research!

    But avoid

    • Asking for help, clarification, or responding to other answers.
    • Making statements based on opinion; back them up with references or personal experience.

    To learn more, see our tips on writing great answers.