Asked  7 Months ago    Answers:  5   Viewed   40 times

It is safe to say that the EAV/CR database model is bad. That said,

Question: What database model, technique, or pattern should be used to deal with "classes" of attributes describing e-commerce products which can be changed at run time?

In a good E-commerce database, you will store classes of options (like TV resolution then have a resolution for each TV, but the next product may not be a TV and not have "TV resolution"). How do you store them, search efficiently, and allow your users to setup product types with variable fields describing their products? If the search engine finds that customers typically search for TVs based on console depth, you could add console depth to your fields, then add a single depth for each tv product type at run time.

There is a nice common feature among good e-commerce apps where they show a set of products, then have "drill down" side menus where you can see "TV Resolution" as a header, and the top five most common TV Resolutions for the found set. You click one and it only shows TVs of that resolution, allowing you to further drill down by selecting other categories on the side menu. These options would be the dynamic product attributes added at run time.

Further discussion:

So long story short, are there any links out on the Internet or model descriptions that could "academically" fix the following setup? I thank Noel Kennedy for suggesting a category table, but the need may be greater than that. I describe it a different way below, trying to highlight the significance. I may need a viewpoint correction to solve the problem, or I may need to go deeper in to the EAV/CR.

Love the positive response to the EAV/CR model. My fellow developers all say what Jeffrey Kemp touched on below: "new entities must be modeled and designed by a professional" (taken out of context, read his response below). The problem is:

  • entities add and remove attributes weekly
    (search keywords dictate future attributes)
  • new entities arrive weekly
    (products are assembled from parts)
  • old entities go away weekly
    (archived, less popular, seasonal)

The customer wants to add attributes to the products for two reasons:

  • department / keyword search / comparison chart between like products
  • consumer product configuration before checkout

The attributes must have significance, not just a keyword search. If they want to compare all cakes that have a "whipped cream frosting", they can click cakes, click birthday theme, click whipped cream frosting, then check all cakes that are interesting knowing they all have whipped cream frosting. This is not specific to cakes, just an example.

 Answers

61

There's a few general pros and cons I can think of, there are situations where one is better than the other:

Option 1, EAV Model:

  • Pro: less time to design and develop a simple application
  • Pro: new entities easy to add (might even be added by users?)
  • Pro: "generic" interface components
  • Con: complex code required to validate simple data types
  • Con: much more complex SQL for simple reports
  • Con: complex reports can become almost impossible
  • Con: poor performance for large data sets

Option 2, Modelling each entity separately:

  • Con: more time required to gather requirements and design
  • Con: new entities must be modelled and designed by a professional
  • Con: custom interface components for each entity
  • Pro: data type constraints and validation simple to implement
  • Pro: SQL is easy to write, easy to understand and debug
  • Pro: even the most complex reports are relatively simple
  • Pro: best performance for large data sets

Option 3, Combination (model entities "properly", but add "extensions" for custom attributes for some/all entities)

  • Pro/Con: more time required to gather requirements and design than option 1 but perhaps not as much as option 2 *
  • Con: new entities must be modelled and designed by a professional
  • Pro: new attributes might be easily added later on
  • Con: complex code required to validate simple data types (for the custom attributes)
  • Con: custom interface components still required, but generic interface components may be possible for the custom attributes
  • Con: SQL becomes complex as soon as any custom attribute is included in a report
  • Con: good performance generally, unless you start need to search by or report by the custom attributes

* I'm not sure if Option 3 would necessarily save any time in the design phase.

Personally I would lean toward option 2, and avoid EAV wherever possible. However, for some scenarios the users need the flexibility that comes with EAV; but this comes with a great cost.

Tuesday, June 1, 2021
 
ShadowZzz
answered 7 Months ago
63

You could have a design like:

 +---------------+     +-------------------+
 | PRODUCTS      |-----< PRODUCT_VARIANTS  |
 +---------------+     +-------------------+
 | #product_id   |     | #product_id       |
 |  product_name |     | #variant_id       |
 +---------------+     |  sku_id           |
         |             +-------------------+
         |                       |
+--------^--------+     +--------^--------+
| PRODUCT_OPTIONS |-----< VARIANT_VALUES  |
+-----------------+     +-----------------+
| #product_id     |     | #product_id     |
| #option_id      |     | #variant_id     |
+--------v--------+     | #option_id      |
         |              |  value_id       |
+-----------------+     +--------v--------+
| OPTIONS         |              |
+-----------------+              |
| #option_id      |              |
|  option_name    |              |
+-----------------+              |
         |                       |
 +-------^-------+               |
 | OPTION_VALUES |---------------+
 +---------------+
 | #option_id    |
 | #value_id     |
 |  value_name   |
 +---------------+

With the Primary, Unique and Foreign Keys:

  • PRODUCTS
    • PK: product_id
    • UK: product_name
  • OPTIONS
    • PK: option_id
    • UK: option_name
  • OPTION_VALUES
    • PK: option_id, value_id
    • UK: option_id, value_name
    • FK: option_id REFERENCES OPTIONS (option_id)
  • PRODUCT_OPTIONS
    • PK: product_id, option_id
    • FK: product_id REFERENCES PRODUCTS (product_id)
    • FK: option_id REFERENCES OPTIONS (option_id)
  • PRODUCT_VARIANTS
    • PK: product_id, variant_id
    • UK: sku_id
    • FK: product_id REFERENCES PRODUCTS (product_id)
  • VARIANT_VALUES
    • PK: product_id, variant_id, option_id
    • FK: product_id, variant_id REFERENCES PRODUCT_VARIANTS (product_id, variant_id)
    • FK: product_id, option_id REFERENCES PRODUCT_OPTIONS (product_id, option_id)
    • FK: option_id, value_id REFERENCES OPTION_VALUES (option_id, value_Id)

You have:

  • PRODUCTS e.g. Shirt, Jumper, Trousers
  • OPTIONS e.g. Size, Colour, Length
  • OPTION_VALUES e.g Size - Small, Medium, Large; Colour - Red, White, Blue
  • PRODUCT_OPTIONS e.g. Shirt - Size, Colour; Trousers - Length, Colour

You then need to create a n-dimensional array, with the number of dimensions equalling the number of options for the product. Each element in the array corresponds to a product variant. There will always be at least one product variant for each product; as there is always the pseudo option of the product "as-is"

  • PRODUCT_VARIANTS e.g Shirt 1, Shirt 2
  • VARIANT_VALUES e.g Shirt 1: Small Red; Shirt 2: Small White

You may wish to have validation to ensure a SKU is not assigned unless values have been specified for all options associated with a product.

Based on the spreadsheet of how you see your data you could enter data in your tables as follows:

PRODUCTS
========
id  name
--- --------
1   Widget 1
2   Widget 2
3   Widget 3
 
PRODUCT_VARIANTS
================
id  product_id name
--- ---------- ------
1   1          Size   (Widget 1)
2   1          Color  (Widget 1)
3   2          Size   (Widget 2)
4   3          Class  (Widget 3)
5   3          Size   (Widget 3)
 
PRODUCT_VARIANT_OPTIONS
=======================
id  product_variant_id name
--- ------------------ -------------
1   1                  Small         (Widget 1; Size)
2   1                  Large         (Widget 1; Size)
3   2                  White         (Widget 1; Color)
4   2                  Black         (Widget 1; Color)
5   3                  Small         (Widget 2; Size)
6   3                  Medium        (Widget 2; Size)
7   4                  Amateur       (Widget 3; Class)
8   4                  Professional  (Widget 3; Class)
9   5                  Medium        (Widget 3; Size)
10  5                  Large         (Widget 3; Size)
 
SKUS
====
id  product_id sku    price
--- ---------- ------ -----
1   1          W1SSCW    10 (Widget 1)
2   1          W1SSCB    10 (Widget 1)
3   1          W1SLCW    12 (Widget 1)
4   1          W1SLCB    15 (Widget 1)
5   2          W2SS     100 (Widget 2)
6   2          W2SM     100 (Widget 2)
7   3          W3CASM    50 (Widget 3)
8   3          W3CASL    50 (Widget 3)
9   3          W3CPSM   150 (Widget 3)
10  3          W3CPSL   160 (Widget 3)
 
PRODUCT_VARIANT_OPTION_COMBINATIONS
===================================
product_variant_option_id sku_id
------------------------- ------
1                         1      (W1SSCW; Size; Small)
3                         1      (W1SSCW; Color; White)
1                         2      (W1SSCB; Size; Small)
4                         2      (W1SSCB; Color; Black)
2                         3      (W1SLCW; Size; Large)
3                         3      (W1SLCW;  Color; White)
2                         4      (W1SLCB; Size; Large)
4                         4      (W1SLCB; Color; Black)
5                         5      (W2SS; Size; Small)
6                         6      (W2SM; Size; Medium)
7                         7      (W3CASM; Class; Amateur)
9                         7      (W3CASM; Size; Medium)
7                         8      (W3CASL; Class; Amateur)
10                        8      (W3CASL; Size; Large)
8                         9      (W3CPSM; Class; Professional)
9                         9      (W3CPSM; Size; Medium)
8                         10     (W3CPSL; Class; Professional)
10                        10     (W3CPSL; Size; Large)

There seems to be nothing in your design from stopping the addition of the entry of the record (product_variant_option_id: 2; sku_id 1) so that SKU W1SSCW has now both the options of Small and Large. There is nothing to stop the entry of the record (product_variant_option_id: 7; sku_id: 1) so that SKU W1SSCW also has the option Amateur.

Based on the spreadsheet of how you see your data you could enter data in my tables as follows:

PRODUCTS
========
product_id product_name
---------- ------------
1          Widget 1
2          Widget 2
3          Widget 3
 
OPTIONS
=======
option_id option_name
--------- -----------
1         Size SL
2         Color
3         Size SM
4         Class
5         Size ML
 
OPTION_VALUES
=============
option_id value_id value_name
--------- -------- ------------
1         1        Small        (Size SL)
1         2        Large        (Size SL)
2         1        White        (Color)
2         2        Black        (Color)
3         1        Small        (Size SM)
3         2        Medium       (Size SM)
4         1        Amateur      (Class)
4         2        Professional (Class)
5         1        Medium       (Size ML)
5         2        Large        (Size ML)
 
PRODUCT_OPTIONS
===============
product_id option_id
---------- ---------
1          1         (Widget 1; Size SL)
1          2         (Widget 1; Color)
2          3         (Widget 2; Size SM)
3          4         (Widget 3; Class)
3          5         (Widget 4; Size ML)
 
PRODUCT_VARIANTS
================
product_id variant_id sku_id
---------- ---------- ------
1          1          W1SSCW (Widget 1)
1          2          W1SSCB (Widget 1)
1          3          W1SLCW (Widget 1)
1          4          W1SLCB (Widget 1)
2          1          W2SS   (Widget 2)
2          2          W2SM   (Widget 2)
3          1          W3CASM (Widget 3)
3          2          W3CASL (Widget 3)
3          3          W3CPSM (Widget 3)
3          4          W3CPSL (Widget 3)
 
VARIANT_VALUES
==============
product_id variant_id option_id value_id
---------- ---------- --------- --------
1          1          1         1        (W1SSCW; Size SL; Small)
1          1          2         1        (W1SSCW; Color; White)
1          2          1         1        (W1SSCB; Size SL; Small)
1          2          2         2        (W1SSCB; Color; Black)
1          3          1         2        (W1SLCW; Size SL; Large)
1          3          2         1        (W1SLCW; Color; White)
1          4          1         2        (W1SLCB; Size SL; Large)
1          4          2         2        (W1SLCB; Color; Black)
2          1          3         1        (W2SS; Size SM; Small)
2          2          3         2        (W2SM; Size SM; Medium)
3          1          4         1        (W3CASM; Class; Amateur)
3          1          5         1        (W3CASM; Size ML; Medium)
3          2          4         1        (W3CASL; Class; Amateur)
3          2          5         2        (W3CASL; Size ML; Large)
3          3          4         2        (W3CPSM; Class; Professional)
3          3          5         1        (W3CPSM; Size ML; Medium)
3          4          4         2        (W3CPSL; Class; Professional)
3          4          5         2        (W3CPSL; Size ML; Large)

In my design you could not enter the additional VARIANT_VALUES record (product_id: 1; variant_id: 1; option_id: 1; value_id: 2) - so that SKU W1SSCW has now both the options of Small and Large - due to the primary key on VARIANT_VALUES and the existing VARIANT_VALUES record (product_id: 1; variant_id: 1; option_id: 1; value_id: 1). In my design you could not enter the VARIANT_VALUES record (product_id: 1; variant_id: 1; option_id: 4; value_id: 1) - so that SKU W1SSCW also has the option Amateur - due to the foreign key referencing PRODUCT_OPTIONS and the lack of a record in this table of (product_id: 1; option_id: 4) indicating that Class is a valid option for product Widget 1.

EDIT: Design with no PRODUCT_OPTIONS table

You could have a design like:

+---------------+     +---------------+
| PRODUCTS      |-----< PRODUCT_SKUS  |
+---------------+     +---------------+
| #product_id   |     | #product_id   |
|  product_name |     | #sku_id       |
+---------------+     |  sku          |
        |             |  price        |
        |             +---------------+
        |                     |
+-------^-------+      +------^------+
| OPTIONS       |------< SKU_VALUES  |
+---------------+      +-------------+
| #product_id   |      | #product_id |
| #option_id    |      | #sku_id     |
|  option_name  |      | #option_id  |
+---------------+      |  value_id   |
        |              +------v------+
+-------^-------+             |
| OPTION_VALUES |-------------+
+---------------+
| #product_id   |
| #option_id    |
| #value_id     |
|  value_name   |
+---------------+

With the Primary, Unique and Foreign Keys:

  • PRODUCTS
    • PK: product_id
    • UK: product_name
  • OPTIONS
    • PK: product_id, option_id
    • UK: product_id, option_name
  • OPTION_VALUES
    • PK: product_id, option_id, value_id
    • UK: product_id, option_id, value_name
    • FK: product-id, option_id REFERENCES OPTIONS (product_id, option_id)
  • PRODUCT_SKUS
    • PK: product_id, sku_id
    • UK: sku_id
    • FK: product_id REFERENCES PRODUCTS (product_id)
  • SKU_VALUES
    • PK: product_id, sku_id, option_id
    • FK: product_id, sku_id REFERENCES PRODUCT_SKUS (product_id, sku_id)
    • FK: product_id, option_id REFERENCES OPTIONS (product_id, option_id)
    • FK: product_id, option_id, value_id REFERENCES OPTION_VALUES (product_id, option_id, value_id)

Based on the spreadsheet of how you see your data you could enter data in these tables as follows:

PRODUCTS
========
product_id product_name
---------- ------------
1          Widget 1
2          Widget 2
3          Widget 3
 
OPTIONS
=======
product_id option_id option_name
---------- --------- -----------
1          1         Size        (Widget 1)
1          2         Color       (Widget 1)
2          1         Size        (Widget 2)
3          1         Class       (Widget 3)
3          2         Size        (Widget 3)
 
OPTION_VALUES
=============
product_id option_id value_id value_name
---------- --------- -------- ------------
1          1         1        Small        (Widget1; Size)
1          1         2        Large        (Widget1; Size)
1          2         1        White        (Widget1; Color)
1          2         2        Black        (Widget1; Color)
2          1         1        Small        (Widget2; Size)
2          1         2        Medium       (Widget2; Size)
3          1         1        Amateur      (Widget3; Class)
3          1         2        Professional (Widget3; Class)
3          2         1        Medium       (Widget3; Size)
3          2         2        Large        (Widget3; Size)
 
PRODUCT_SKUS
============
product_id sku_id sku
---------- ------ ------
1          1      W1SSCW (Widget 1)
1          2      W1SSCB (Widget 1)
1          3      W1SLCW (Widget 1)
1          4      W1SLCB (Widget 1)
2          1      W2SS   (Widget 2)
2          2      W2SM   (Widget 2)
3          1      W3CASM (Widget 3)
3          2      W3CASL (Widget 3)
3          3      W3CPSM (Widget 3)
3          4      W3CPSL (Widget 3)
 
SKU_VALUES
==========
product_id sku_id option_id value_id
---------- ------ --------- --------
1          1      1         1        (W1SSCW; Size; Small)
1          1      2         1        (W1SSCW; Color; White)
1          2      1         1        (W1SSCB; Size; Small)
1          2      2         2        (W1SSCB; Color; Black)
1          3      1         2        (W1SLCW; Size; Large)
1          3      2         1        (W1SLCW; Color; White)
1          4      1         2        (W1SLCB; Size; Large)
1          4      2         2        (W1SLCB; Color; Black)
2          1      1         1        (W2SS; Size; Small)
2          2      1         2        (W2SM; Size; Medium)
3          1      1         1        (W3CASM; Class; Amateur)
3          1      2         1        (W3CASM; Size; Medium)
3          2      1         1        (W3CASL; Class; Amateur)
3          2      2         2        (W3CASL; Size; Large)
3          3      1         2        (W3CPSM; Class; Professional)
3          3      2         1        (W3CPSM; Size; Medium)
3          4      1         2        (W3CPSL; Class; Professional)
3          4      2         2        (W3CPSL; Size; Large)
Thursday, June 3, 2021
 
Savageman
answered 6 Months ago
57

You're making a common mistake of database design, storing name in one column and value in another column. This is not a relational database design.

Each attribute should be named by the column name. Color, pages, shirt size, publish date, should be column names.

If each product type has a distinct set of attributes, there are other solutions. See my answers to:

  • Product table, many kinds of product, each product has many parameters for details.
  • How do you model custom attributes of entities?
  • Design question: Filterable attributes, SQL
  • How to design a database schema to support tagging with categories?
  • How to define structure in a tag-based organization?

Also please read this story: Bad CaRMa: Introducing Vision before you implement a database designed around name-value pairs as you are doing.

Sunday, August 8, 2021
 
DMTintner
answered 4 Months ago
51

Short answer - yes, a reporting database is a reasonable approach to solving the problems of reporting from an EAV data model.

I spent a number of years working with an information management solution which allowed end users complete freedom to define their own data model, with both the schema and the data stored using an EAV model. Interestingly, this product provided meta-schema objects used to fulfill reporting requirements (e.g. graphs to provide object navigation, views to perform projection, etc.). This meant that the end user was free to define queries using the same terms and concepts that they'd used to build the data model in the first instance. The act of reporting was essentially to compute the data set by navigating these definitions, and hand the result over to a traditional report writing tool as if it were relational data.

One of the strengths of this approach was that the same mechanism that was already in place to transform the EAV model to something the user could work with could be reused and applied to the reporting function.

Wednesday, August 11, 2021
 
abyss.7
answered 4 Months ago
71

store user defined data in SQL

I think you are looking for the Entity–attribute–value database model in which:

The basic idea is to store attributes, and their corresponding values, as rows in a single table.

Typically the table has at least three columns: entity, attribute, and value. Though if there is only a single relevant entity, e.g. a table for application configuration or option settings, the entity column can be excluded.

See this pages as a start:

  • Using Database Metadata and its Semantics to Generate Automatic and Dynamic Web Entry Forms (pdf)

  • Planning and Implementing a Metadata-Driven Digital Repository (pdf)

I retagged your question with entity-attribute-value tag, in which you can browse a lot of threads that relate to your case.

Tuesday, August 31, 2021
 
Grokodile
answered 3 Months ago
Only authorized users can answer the question. Please sign in first, or register a free account.
Not the answer you're looking for? Browse other questions tagged :
 
Share