Dealing with really big db tables, any advice?

05.03.2022, 20:11

Any advice for dealing with very big tables in IHP? Product page here has gotten so dreadfully slow even when paginated and client-side loaded: https://dill.network/Products

There is 2 million products in the DB, so big, but I guess it's not enourmous?

Sharding seems a bit crazy to me, I suppose scaling up the db is the most viable thing to do? I scaled it up once (2 GB RAM / 1vCPU) and it helped noticeably, but still pretty slow.

marc PRO

05.03.2022, 22:01

Have you checked that all indices are set for the tables your using?

Lillo PRO

06.03.2022, 07:51

It's basically just a table looks that like this:

CREATE TABLE products (
    id UUID DEFAULT uuid_generate_v4() PRIMARY KEY NOT NULL,
    barcode TEXT NOT NULL UNIQUE,
    name TEXT NOT NULL,
    description TEXT DEFAULT NULL,
    manufacturer TEXT DEFAULT NULL,
    not_vegan_reason TEXT DEFAULT NULL,
    warning TEXT DEFAULT NULL,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW() NOT NULL,
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW() NOT NULL,
    image_url TEXT DEFAULT '' NOT NULL,
    vegan_status veg_status DEFAULT 'not_sure' NOT NULL,
    ingredients TEXT
);

Could an index benefit this table alone somehow?

It fetches comment count and product change count, but I removed these and still slow.

And I have a search query that looks like this:

        let searchName = query @Product |> filterWhereILike (#name, "%" <> searchTerm <> "%")
        let searchManufacturer = query @Product |> filterWhereILike (#manufacturer, Just ("%" <> searchTerm <> "%"))
        let queryProducts = queryUnion searchName searchManufacturer
        (productsQ, pagination) <-
            queryProducts
                |> ( \p -> case veganStatusParam of
                        Nothing -> p
                        Just vegStatusParam ->
                            p |> filterWhere (#veganStatus, vegStatusParam)
                   )
                |> ( \p -> case orderType of
                        Ascending -> orderByAsc #updatedAt p
                        Descending -> orderByDesc #updatedAt p
                   )
                |> paginateWithOptions
                    ( defaultPaginationOptions
                        |> set #windowSize 1
                        |> set #maxItems 10
                    )
        products <- productsQ |> fetch

I guess maybe the search terms could be a performance bottleneck?

Amitai Burstein

06.03.2022, 08:13

Try adding an index on the name and manufacturer.

Lillo PRO

06.03.2022, 08:33

Amazing! So much faster now! Thanks!

Daniel Sokil PRO

06.03.2022, 08:49

You can create a one index per table, and Postgres will optimize it:

CREATE INDEX products_index ON products (id, name, manufacturer);

Lillo PRO

06.03.2022, 08:51

Really nice, thanks! Only bottleneck I have now seems to be ORDER BY. Looking into how I can do some optimization on timestamps.

Daniel Sokil PRO

06.03.2022, 08:55

Index will also speed up sorting, adding one on the updatedAt.

Daniel Sokil PRO

06.03.2022, 08:58

Another option is to increase the memory limits for Postgres, so it sorts in memory, not on disk.

Lillo PRO

06.03.2022, 09:37

Didn't get noticable improvement on adding date to regular index, but adding a BTREE index on the date really did the trick :D

CREATE INDEX products_updated_at_index ON products USING BTREE(updated_at);

Daniel Sokil PRO

08.03.2022, 02:28

Interesting, the default index type is btree, so that's strange.

Lillo PRO

08.03.2022, 08:29

It became instantly faster after running that migration, but can't say for 100% sure if the previous index was processing or something 🤷

Add Response