[webkit-reviews] review denied: [Bug 195088] Implement Telemetry and Dumping Routines for SQLite backend : [Attachment 377623] Patch

bugzilla-daemon at webkit.org bugzilla-daemon at webkit.org
Thu Aug 29 14:42:00 PDT 2019


Sam Weinig <sam at webkit.org> has denied Katherine_cheney at apple.com's request for
review:
Bug 195088: Implement Telemetry and Dumping Routines for SQLite backend
https://bugs.webkit.org/show_bug.cgi?id=195088

Attachment 377623: Patch

https://bugs.webkit.org/attachment.cgi?id=377623&action=review




--- Comment #6 from Sam Weinig <sam at webkit.org> ---
Comment on attachment 377623
  --> https://bugs.webkit.org/attachment.cgi?id=377623
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=377623&action=review

Thanks for working on this. 

I'm a little concerned with the amount of string concatenation going on around
the preparation of the SQLiteStatements. This is traditionally something to be
avoided in order to reduce the chance for sql injection attacks.

> Source/WebKit/ChangeLog:10
> +	   Implemented database telemetry calculating for ITP. Mimicked
ResourceLoadStatisticsMemoryStore
> +	   telemetry logging behavior using SQLite Queries as opposed to vector
sorting/manipulation.

It would be good to explain the motivation for this.

Also, it looks like this doesn't have any new tests associated with it. Is this
code path already tested?

> Source/WebKit/ChangeLog:27
> +	   * NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp:
> +	  
(WebKit::ResourceLoadStatisticsDatabaseStore::ResourceLoadStatisticsDatabaseSto
re):
> +	   (WebKit::ResourceLoadStatisticsDatabaseStore::prepareStatements):
> +	  
(WebKit::ResourceLoadStatisticsDatabaseStore::getMedianOfPrevalentResourcesWith
UserInteraction const):
> +	  
(WebKit::ResourceLoadStatisticsDatabaseStore::getNumberOfPrevalentResources
const):
> +	  
(WebKit::ResourceLoadStatisticsDatabaseStore::getNumberOfPrevalentResourcesWith
UI const):
> +	  
(WebKit::ResourceLoadStatisticsDatabaseStore::getTopPrevelentResourceDaysSinceU
I const):
> +	  
(WebKit::ResourceLoadStatisticsDatabaseStore::getMedianStatisticOfPrevalentReso
urceWithoutUserInteraction const):
> +	  
(WebKit::ResourceLoadStatisticsDatabaseStore::getNumberOfPrevalentResourcesInTo
pResources const):
> +	   (WebKit::ResourceLoadStatisticsDatabaseStore::calculateTelemetryData
const):
> +	  
(WebKit::ResourceLoadStatisticsDatabaseStore::calculateAndSubmitTelemetry
const):
> +	   * NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.h:
> +	   * NetworkProcess/Classifier/WebResourceLoadStatisticsTelemetry.cpp:
> +	   (WebKit::databaseSubmitTopLists):
> +	   (WebKit::WebResourceLoadStatisticsTelemetry::submitTelemetry):
> +	   * NetworkProcess/Classifier/WebResourceLoadStatisticsTelemetry.h:

Please fill in the per-function/per-file parts of the ChangeLog as well. Please
include both what is changing and why the change is being made.

>
Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp
:518
> +    return SQLiteStatement(database, makeString("SELECT
mostRecentUserInteractionTime FROM ObservedDomains INNER JOIN (SELECT ", pre, "
AND hadUserInteraction = 1 ", post, ") as q ON ObservedDomains.domainID =
q.domainID LIMIT 1 OFFSET ", String::number(count)));

The String::number() shouldn't be needed. You can pass count directly to
makeString().

>
Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp
:527
> +// 1: subframe
> +// 2: subresource
> +// 3: unique redirects
> +// 4: number of times data records were removed
> +// 5: number of times accessed as first party due to user interaction
> +// 6: number of times accessed as first party due to storage access api
> +static SQLiteStatement makeMedianWithoutUIQuery(SQLiteDatabase& database,
int offset, int statistic, const String& pre, const String& post)

int statistic should probably be an enum. Then, this comment would not be
necessary.

>
Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp
:530
> +    String preamble;
> +    String end;

These can be StringViews or const char*.

>
Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp
:534
> +    case 1: preamble = "SELECT cnt1 FROM ObservedDomains o INNER JOIN(SELECT
cnt1, ";
> +	   end = ") as q ON o.domainID = q.domainID LIMIT 1 OFFSET ";

preamble = ... should go on the line after the case statement.

I think you should also consider pull out this switch statement into it's own
static function that returns a std::pair<StringView, StringView> or
std::pair<const char*, const char*>. Then you can call that function and use
structured binding to do something like:

auto& [preamble, end] = statementPartsForStatistic(statistic);

>
Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp
:552
> +    default:
> +	  
LOG_ERROR("ResourceLoadStatisticsDatabaseStore::getMedianStatisticOfPrevalentRe
sourceWithoutUserInteraction only supports statistics 1-6. Undetermined Query
behavior will result.");

This should be a RELEASE_ASSERT_NOT_REACHED(); Or, if you turn statistic into a
enum, you can remove this entirely.

>
Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp
:555
> +    return SQLiteStatement(database, makeString(preamble, pre, " AND
hadUserInteraction = 0 ", post, end, String::number(offset)));

String::number() should not be needed.

>
Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp
:588
> +    return (m_countPrevalentResourcesStatement.step() == SQLITE_ROW) ?
m_countPrevalentResourcesStatement.getColumnInt(0) : -1;

Returning -1 in a function that returns an unsigned seems wrong. I am guessing
getColumnInt also returns an int, not an unsigned. This should probably return
an int. You should also consider using Optional<> or WTF::notFound (which is
just a named constant for -1) instead of -1.

>
Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp
:593
> +    return (m_countPrevalentResourcesWithUserInteractionStatement.step() ==
SQLITE_ROW) ?
m_countPrevalentResourcesWithUserInteractionStatement.getColumnInt(0) : -1;

Same comment as above.

>
Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp
:596
> +unsigned
ResourceLoadStatisticsDatabaseStore::getTopPrevelentResourceDaysSinceUI(const
String& pre, const String& post) const

pre and post are always the same here since there is only one caller. Do you
imagine other callers in the future with different pre and post? If not,
probably best to remove the parameters and either inline pre and post, or if
they need to be in a shared location, move them into their own static
functions. It would also be nice to give them slightly clearer names than pre
and post, since while that does tell me where in the statement they go, it
doesn't tell me what they do.

>
Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp
:611
> +unsigned
ResourceLoadStatisticsDatabaseStore::getMedianStatisticOfPrevalentResourceWitho
utUserInteraction(int range, int statistic, int
numberOfPrevalentResourcesWithoutUI, const String& pre, const String& post)
const

Same points about pre/post as above.

>
Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.cpp
:639
> +    SQLiteStatement prevalentResourceCountInTop(m_database,
makeString("SELECT COUNT(*) FROM (SELECT * FROM ObservedDomains o INNER
JOIN(SELECT ", pre, " ", post, ") as q on q.domainID = o.domainID LIMIT ",
String::number(range), ") as p WHERE p.hadUserInteraction = 1;"));

I think the idiomatic way of doing something like this to create one cached
SQLiteStatement and then binding the changing variable using a bind function
(e.g. SQLiteStatement::bindInt()). In this case, that would be range. It looks
like prevalentResourceCount is unused.

Same points about pre/post as above.

>
Source/WebKit/NetworkProcess/Classifier/ResourceLoadStatisticsDatabaseStore.h:5
9
> +    unsigned numberOfPrevalentResourcesWithUITop1;
> +    unsigned numberOfPrevalentResourcesWithUITop3;
> +    unsigned numberOfPrevalentResourcesWithUITop10;
> +    unsigned numberOfPrevalentResourcesWithUITop50;
> +    unsigned numberOfPrevalentResourcesWithUITop100;

If these were std::tuples, or a new struct (e.g. struct Foo { unsigned UITop1;
...unsigned UITop100; };) I think this would be easier to work with. A bunch of
the duplicated code in
ResourceLoadStatisticsDatabaseStore::calculateTelemetryData and
databaseSubmitTopLists could be turned into loops.

>
Source/WebKit/NetworkProcess/Classifier/WebResourceLoadStatisticsTelemetry.h:30
> +#include "ResourceLoadStatisticsDatabaseStore.h"

This shouldn't be needed if you forward declare
PrevalentResourceDatabaseTelemetry.


More information about the webkit-reviews mailing list