Monday, April 29, 2024

Predicting Timely Diagnosis of Metastatic Breast Cancer for the WiDS Datathon 2024


In today’s blog, Grace Woolson will show how you can use MATLAB and machine learning to make meaningful deductions from healthcare data for patients who have been diagnosed with metastatic breast cancer. Over to you, Grace!

Introduction

In this blog, I’ll show how you can use MATLAB for the WiDS Datathon 2024 using the dataset for WiDS Datathon #1, which runs from January 9th 2024 – March 1st 2024. This challenge tasks participants with creating a model that can predict whether or not a patient with metastatic breast cancer will receive a diagnosis within 90 days, based on patient and environmental data. This can help identify relationships between demographics or environmental hazards and the likelihood of getting timely treatment. Please note that this blog is based on a subset of the data, and there may be slight differences between this dataset and the one provided by WiDS.
MathWorks is happy to support participants of the Women in Data Science Datathon 2024 by providing complimentary MATLAB licenses, tutorials, workshops, and additional resources. To request complimentary licenses for you and your teammates, go to this MathWorks site, click the “Request Software” button, and fill out the software request form.
This tutorial will walk through the following steps of the model-making process:
  1. Importing a Tabular Dataset
  2. Preprocessing the Data
  3. Exploring and Analyzing Tabular Data
  4. Choosing and Creating Features
  5. Training a Machine Learning Model
  6. Evaluating a Machine Learning Model
  7. Making New Predictions and Exporting Submissions

Import Data

First, make sure the ‘Current Folder’ is the folder where you saved the data. If you have not already done so, you can download the data from Kaggle after you register for the datathon. The data is provided as a .CSV file, so we can use the readtable function to import the whole file as a table.
dataFolder = fullfile(pwd);
trainDataFilename = 'Training.csv';
allTrainData = readtable(fullfile(dataFolder, trainDataFilename))
allTrainData = 12906×83 table
[Table preview: first 14 rows. Displayed variables: patient_id, patient_race, payer_type, patient_state, patient_zip3, patient_age, patient_gender, bmi, breast_cancer_diagnosis_code, breast_cancer_diagnosis_desc, metastatic_cancer_diagnosis_code, metastatic_first_novel_treatment, metastatic_first_novel_treatment_type, Region, Division, population, density, age_median, age_under_10, age_10_to_19, age_20s, age_30s, age_40s, age_50s, age_60s, age_70s, age_over_80, male, female, married]
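If you would rather control variable types at import time, one option is to inspect the import options before calling readtable. The sketch below is only an illustration and not part of the workflow in this post: it writes a tiny made-up CSV to a temporary file (the values and the names demoFile, demoTable, and demoData are hypothetical) and reads it back with the text column imported as categorical.

```matlab
% Build a tiny stand-in CSV so the example is self-contained (made-up values).
demoFile = [tempname '.csv'];
demoTable = table([1; 2; 3], {'CA'; 'TX'; 'CA'}, [59; 62; 47], ...
    'VariableNames', {'patient_id', 'patient_state', 'patient_age'});
writetable(demoTable, demoFile);

% Inspect how MATLAB plans to import each variable.
opts = detectImportOptions(demoFile);
disp(opts.VariableTypes)

% Ask for the text column to be read as categorical instead of char.
opts = setvartype(opts, 'patient_state', 'categorical');
demoData = readtable(demoFile, opts);

% Clean up the temporary file.
delete(demoFile);
```

With the real dataset you would pass the path to your own CSV file instead of demoFile.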
I want to see some high-level statistics about the data, so I’ll use the summary function to get an idea of what kind of information we have.
summary(allTrainData)

Variables:

patient_id: 12906×1 double

Values:

Min 1.0006e+05
Median 5.4352e+05
Max 9.999e+05

patient_race: 12906×1 cell array of character vectors

payer_type: 12906×1 cell array of character vectors

patient_state: 12906×1 cell array of character vectors

patient_zip3: 12906×1 double

Values:

Min 101
Median 554
Max 999

patient_age: 12906×1 double

Values:

Min 18
Median 59
Max 91

patient_gender: 12906×1 cell array of character vectors

bmi: 12906×1 double

Values:

Min 14
Median 28.19
Max 85
NumMissing 8965

breast_cancer_diagnosis_code: 12906×1 cell array of character vectors

breast_cancer_diagnosis_desc: 12906×1 cell array of character vectors

metastatic_cancer_diagnosis_code: 12906×1 cell array of character vectors

metastatic_first_novel_treatment: 12906×1 cell array of character vectors

metastatic_first_novel_treatment_type: 12906×1 cell array of character vectors

Region: 12906×1 cell array of character vectors

Division: 12906×1 cell array of character vectors

population: 12906×1 double

Values:

Min 635.55
Median 19154
Max 71374
NumMissing 1

density: 12906×1 double

Values:

Min 0.91667
Median 700.34
Max 21172
NumMissing 1

age_median: 12906×1 double

Values:

Min 20.6
Median 40.639
Max 54.57
NumMissing 1

age_under_10: 12906×1 double

Values:

Min 0
Median 11.039
Max 17.675
NumMissing 1

age_10_to_19: 12906×1 double

Values:

Min 6.3143
Median 12.924
Max 35.3
NumMissing 1

age_20s: 12906×1 double

Values:

Min 5.925
Median 12.538
Max 62.1
NumMissing 1

age_30s: 12906×1 double

Values:

Min 1.5
Median 12.443
Max 25.471
NumMissing 1

age_40s: 12906×1 double

Values:

Min 0.8
Median 12.124
Max 17.82
NumMissing 1

age_50s: 12906×1 double

Values:

Min 0
Median 13.568
Max 21.661
NumMissing 1

age_60s: 12906×1 double

Values:

Min 0.2
Median 12.533
Max 29.855
NumMissing 1

age_70s: 12906×1 double

Values:

Min 0
Median 7.3169
Max 19
NumMissing 1

age_over_80: 12906×1 double

Values:

Min 0
Median 3.8
Max 18.825
NumMissing 1

male: 12906×1 double

Values:

Min 39.725
Median 49.976
Max 61.6
NumMissing 1

female: 12906×1 double

Values:

Min 38.4
Median 50.024
Max 60.275
NumMissing 1

married: 12906×1 double

Values:

Min 0.9
Median 49.434
Max 66.903
NumMissing 1

divorced: 12906×1 double

Values:

Min 0.2
Median 12.653
Max 19.831
NumMissing 1

never_married: 12906×1 double

Values:

Min 13.44
Median 32.004
Max 98.9
NumMissing 1

widowed: 12906×1 double

Values:

Min 0
Median 5.5208
Max 23.055
NumMissing 1

family_size: 12906×1 double

Values:

Min 2.5504
Median 3.1665
Max 4.1723
NumMissing 4

family_dual_income: 12906×1 double

Values:

Min 19.312
Median 52.592
Max 70.925
NumMissing 4

income_household_median: 12906×1 double

Values:

Min 29222
Median 69803
Max 1.6412e+05
NumMissing 4

income_household_under_5: 12906×1 double

Values:

Min 0.75
Median 2.8382
Max 19.62
NumMissing 4

income_household_5_to_10: 12906×1 double

Values:

Min 0.36154
Median 2.1604
Max 11.872
NumMissing 4

income_household_10_to_15: 12906×1 double

Values:

Min 1.0154
Median 3.7171
Max 14.278
NumMissing 4

income_household_15_to_20: 12906×1 double

Values:

Min 1.0278
Median 3.7712
Max 12.918
NumMissing 4

income_household_20_to_25: 12906×1 double

Values:

Min 1.1
Median 4.0421
Max 14.35
NumMissing 4

income_household_25_to_35: 12906×1 double

Values:

Min 2.65
Median 8.4353
Max 18.34
NumMissing 4

income_household_35_to_50: 12906×1 double

Values:

Min 1.7
Median 11.793
Max 24.075
NumMissing 4

income_household_50_to_75: 12906×1 double

Values:

Min 4.95
Median 17.076
Max 27.13
NumMissing 4

income_household_75_to_100: 12906×1 double

Values:

Min 4.7333
Median 12.677
Max 24.8
NumMissing 4

income_household_100_to_150: 12906×1 double

Values:

Min 4.2889
Median 16.016
Max 31.325
NumMissing 4

income_household_150_over: 12906×1 double

Values:

Min 0.84
Median 14.703
Max 52.824
NumMissing 4

income_household_six_figure: 12906×1 double

Values:

Min 5.6926
Median 30.575
Max 69.032
NumMissing 4

income_individual_median: 12906×1 double

Values:

Min 4316
Median 35253
Max 88910
NumMissing 1

home_ownership: 12906×1 double

Values:

Min 15.85
Median 69.669
Max 90.367
NumMissing 4

housing_units: 12906×1 double

Values:

Min 0
Median 6994.4
Max 25923
NumMissing 1

home_value: 12906×1 double

Values:

Min 60629
Median 2.4784e+05
Max 1.8531e+06
NumMissing 4

rent_median: 12906×1 double

Values:

Min 448.4
Median 1168
Max 2965.2
NumMissing 4

rent_burden: 12906×1 double

Values:

Min 17.416
Median 30.986
Max 78.94
NumMissing 4

education_less_highschool: 12906×1 double

Values:

Min 0
Median 10.843
Max 34.325
NumMissing 1

education_highschool: 12906×1 double

Values:

Min 0
Median 27.406
Max 53.96
NumMissing 1

education_some_college: 12906×1 double

Values:

Min 7.2
Median 29.286
Max 50.133
NumMissing 1

education_bachelors: 12906×1 double

Values:

Min 2.4657
Median 19.047
Max 41.7
NumMissing 1

education_graduate: 12906×1 double

Values:

Min 2.0941
Median 10.796
Max 51.84
NumMissing 1

education_college_or_above: 12906×1 double

Values:

Min 7.0488
Median 30.141
Max 77.817
NumMissing 1

education_stem_degree: 12906×1 double

Values:

Min 23.915
Median 43.066
Max 73
NumMissing 1

labor_force_participation: 12906×1 double

Values:

Min 30.7
Median 62.778
Max 78.67
NumMissing 1

unemployment_rate: 12906×1 double

Values:

Min 0.82308
Median 5.4741
Max 18.8
NumMissing 1

self_employed: 12906×1 double

Values:

Min 2.263
Median 12.748
Max 25.538
NumMissing 4

farmer: 12906×1 double

Values:

Min 0
Median 0.45493
Max 26.729
NumMissing 4

race_white: 12906×1 double

Values:

Min 14.496
Median 70.878
Max 98.444
NumMissing 1

race_black: 12906×1 double

Values:

Min 0.060976
Median 6.4103
Max 69.66
NumMissing 1

race_asian: 12906×1 double

Values:

Min 0
Median 2.9667
Max 49.85
NumMissing 1

race_native: 12906×1 double

Values:

Min 0
Median 0.43095
Max 76.935
NumMissing 1

race_pacific: 12906×1 double

Values:

Min 0
Median 0.054054
Max 14.758
NumMissing 1

race_other: 12906×1 double

Values:

Min 0.0025641
Median 3.5136
Max 33.189
NumMissing 1

race_multiple: 12906×1 double

Values:

Min 0.43333
Median 5.802
Max 26.43
NumMissing 1

hispanic: 12906×1 double

Values:

Min 0.19444
Median 11.983
Max 91.005
NumMissing 1

disabled: 12906×1 double

Values:

Min 4.6
Median 12.884
Max 35.156
NumMissing 1

poverty: 12906×1 double

Values:

Min 3.4333
Median 12.178
Max 38.348
NumMissing 4

limited_english: 12906×1 double

Values:

Min 0
Median 2.7472
Max 26.755
NumMissing 4

commute_time: 12906×1 double

Values:

Min 12.461
Median 27.788
Max 48.02
NumMissing 1

health_uninsured: 12906×1 double

Values:

Min 2.44
Median 7.4657
Max 27.566
NumMissing 1

veteran: 12906×1 double

Values:

Min 1.2
Median 6.8471
Max 25.2
NumMissing 1

Ozone: 12906×1 double

Values:

Min 30.939
Median 39.108
Max 52.237
NumMissing 29

PM25: 12906×1 double

Values:

Min 2.636
Median 7.6866
Max 11.169
NumMissing 29

N02: 12906×1 double

Values:

Min 2.7604
Median 15.589
Max 31.505
NumMissing 29

DiagPeriodL90D: 12906×1 double

Values:

Min 0
Median 1
Max 1

Take some time to scroll through this summary and see what information or patterns you can learn! Here are some things I notice:
  1. There are a lot of rows or variables that just say “cell array of character vectors”, which doesn’t tell us much about the data.
  2. There are a few variables that have a high ‘NumMissing’ value.
  3. The numeric variables can have dramatically different minimums and maximums.
We can use these observations to make decisions about how we want to explore and preprocess the dataset.

Process and Clean the Data

1. Convert text data to categorical

Text data can be hard for machine learning algorithms to understand, so let’s go through and change each “cell array of character vectors” to a categorical. This will help the algorithm sort the text into different categories instead of treating it as a series of individual letters.
% Find the variables stored as cell arrays of character vectors
varTypes = varfun(@class, allTrainData, OutputFormat="cell");
catIdx = strcmp(varTypes, "cell");
varNames = allTrainData.Properties.VariableNames;
catVarNames = varNames(catIdx);
% Convert each of those variables to categorical
for catNameIdx = 1:length(catVarNames)
    allTrainData.(catVarNames{catNameIdx}) = categorical(allTrainData.(catVarNames{catNameIdx}));
end
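As a possible shortcut, the same conversion can be written without an explicit loop using convertvars. This is a minimal sketch on a small made-up table (the values and the name demo are hypothetical), not the code used in this post.

```matlab
% Small stand-in table with two text variables (made-up values).
demo = table({'White'; ''; 'Black'}, ...
    {'MEDICAID'; 'COMMERCIAL'; 'COMMERCIAL'}, [59; 62; 47], ...
    'VariableNames', {'patient_race', 'payer_type', 'patient_age'});

% Convert every cell-array-of-character-vectors variable in one call.
demo = convertvars(demo, @iscellstr, 'categorical');

% Empty text becomes <undefined>, which MATLAB treats as missing.
disp(categories(demo.payer_type))
disp(nnz(ismissing(demo.patient_race)))
```

Numeric variables such as patient_age are left untouched because iscellstr returns false for them.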

2. Handle Missing Data

Now I want to handle all that missing data I noticed earlier. I’ll go through each variable and specifically look at variables that are missing data for over half of the rows or observations.
dataSum = summary(allTrainData);
for nameIdx = 1:length(varNames)
    varName = varNames{nameIdx};
    varNumMissing = dataSum.(varName).NumMissing;
    % Report variables that are missing in more than half of the rows
    if varNumMissing > (height(allTrainData) / 2)
        disp(varName);
        disp(varNumMissing);
    end
end
bmi
8965
metastatic_first_novel_treatment
12882
metastatic_first_novel_treatment_type
12882
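Another way to get the same kind of list is to compute the fraction of missing entries per variable directly with ismissing. The sketch below uses a small made-up table (the values and the names demo, fracMissing, and mostlyMissing are hypothetical) and is an alternative illustration rather than the approach taken in this post.

```matlab
% Stand-in table (made-up values) with one mostly-missing variable.
demo = table([NaN; NaN; 27.5; NaN], [59; 62; 47; 70], ...
    categorical({'CA'; 'TX'; ''; 'NY'}), ...
    'VariableNames', {'bmi', 'patient_age', 'patient_state'});

% ismissing returns a logical array the same size as the table;
% averaging down each column gives the fraction of missing rows.
fracMissing = mean(ismissing(demo), 1);

% Names of variables that are missing in more than half of the rows.
mostlyMissing = demo.Properties.VariableNames(fracMissing > 0.5);
disp(mostlyMissing)
```

The resulting cell array of names could be passed straight to removevars.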
Let’s remove these variables entirely, since they won’t be too helpful for our algorithm.
allTrainData = removevars(allTrainData, ["bmi", "metastatic_first_novel_treatment", "metastatic_first_novel_treatment_type"])
allTrainData = 12906×80 table
[Table preview: first 14 rows. Displayed variables: patient_id, patient_race, payer_type, patient_state, patient_zip3, patient_age, patient_gender, breast_cancer_diagnosis_code, breast_cancer_diagnosis_desc, metastatic_cancer_diagnosis_code, Region, Division, population, density, age_median, age_under_10, age_10_to_19, age_20s, age_30s, age_40s, age_50s, age_60s, age_70s, age_over_80, male, female, married, divorced, never_married, widowed. Missing categorical entries are shown as <undefined>.]
Now I want to look at each row and remove any that are missing too many values. It’s okay to have a few missing data points in your dataset, but if you have too many it can cause your machine learning algorithm to be less accurate. I’ll use the Clean Missing Data live task to remove any rows that are missing 2 or more data points.
% Remove missing data
[fullData,missingIndices] = rmmissing(allTrainData,"MinNumMissing",2);
% Display results
figure
% Get locations of missing data
indicesForPlot = ismissing(allTrainData.patient_age);
mask = missingIndices & ~indicesForPlot;
% Plot cleaned data
plot(find(~missingIndices),fullData.patient_age,"SeriesIndex",1,"LineWidth",1.5, ...
    "DisplayName","Cleaned data")
hold on
% Plot data in rows where other variables contain missing entries
plot(find(mask),allTrainData.patient_age(mask),"x","SeriesIndex","none", ...
    "DisplayName","Removed by other variables")
% Plot removed missing entries
x = repelem(find(indicesForPlot),3);
y = repmat([ylim(gca) missing]',nnz(indicesForPlot),1);
plot(x,y,"Color",[145 145 145]/255,"DisplayName","Removed missing entries")
title("Number of removed missing entries: " + nnz(indicesForPlot))
hold off
legend
ylabel("patient_age","Interpreter","none")
clear indicesForPlot mask x y
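To see what the "MinNumMissing" option does in isolation, here is a minimal sketch on a four-row made-up table (the values and the names demo, cleaned, and removedRows are hypothetical). Only rows with at least two missing entries are dropped; rows with a single gap are kept.

```matlab
% Stand-in table (made-up values): row 3 has two missing entries,
% rows 1 and 2 have one each, and row 4 has none.
demo = table([1; 2; 3; 4], [NaN; 62; NaN; 70], [30.1; NaN; NaN; 28.4], ...
    'VariableNames', {'patient_id', 'patient_age', 'density'});

% Remove only the rows that are missing 2 or more values.
[cleaned, removedRows] = rmmissing(demo, 'MinNumMissing', 2);

disp(cleaned)
disp(find(removedRows))
```

Raising the threshold keeps more rows at the cost of leaving more gaps for the model to cope with.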

Explore the Data

Now that the data is cleaned up, you should spend some time exploring it: understand how different variables may interact with each other, see whether you can draw any meaningful conclusions, and work out which variables may be more or less important when it comes to predicting time to diagnosis.

Univariate Analysis

First, I want to separate the data into two datasets: one full of patients who were diagnosed in 90 days or less (the 1 or “True” values), and one full of patients who were not (the 0 or “False” values). This will allow me to explore the data patterns in each of these datasets and look for any meaningful differences.
allTrueIdx = fullData.DiagPeriodL90D == 1;
allTrueData = fullData(allTrueIdx, :)
allTrueData = 7559×80 table
[Table preview: first 14 rows. Displayed variables: patient_id, patient_race, payer_type, patient_state, patient_zip3, patient_age, patient_gender, breast_cancer_diagnosis_code, breast_cancer_diagnosis_desc, metastatic_cancer_diagnosis_code, Region, Division, population, density, age_median, age_under_10, age_10_to_19, age_20s, age_30s, age_40s, age_50s, age_60s, age_70s, age_over_80, male, female, married, divorced, never_married, widowed.]
allFalseIdx = fullData.DiagPeriodL90D == 0;
allFalseData = fullData(allFalseIdx, :)
allFalseData = 4598×80 table
[Table preview: first 14 rows. Displayed variables: patient_id, patient_race, payer_type, patient_state, patient_zip3, patient_age, patient_gender, breast_cancer_diagnosis_code, breast_cancer_diagnosis_desc, metastatic_cancer_diagnosis_code, Region, Division, population, density, age_median, age_under_10, age_10_to_19, age_20s, age_30s, age_40s, age_50s, age_60s, age_70s, age_over_80, male, female, married, divorced, never_married, widowed.]
Now we can use the Create Plot Live Task to plot histograms of the different variables in each dataset. In the plot below, blue bars represent data from the folks who were diagnosed in a timely manner, and the pink bars represent data from the folks who were not.
figure
% Create histogram of selected data
histogram(allTrueData.health_uninsured,"NumBins",40,"DisplayName","health_uninsured");
hold on
% Create histogram of selected data
histogram(allFalseData.health_uninsured,"NumBins",40,"DisplayName","health_uninsured");
hold off
legend
Take some time to explore these visualizations on your own, as I can only show one at a time in this blog. It's worth noting that we have less False data than True data, so the pink bars will almost always be lower than the blue bars. If there are pink bars that are higher, or if the shapes are different, that may indicate a relationship between a variable and time to diagnosis.
I didn't see many significant differences in shape, though I did notice that for the 'health_uninsured' histograms the pink bars are fairly high in the higher numbers, indicating that there could be a correlation between populations with high rates of being uninsured and time to diagnosis.

Bivariate and Multivariate Analysis

You can break the data down further and plot two (or more!) variables against each other to see if you can find any patterns. In the plot below, for example, we can see the percentage of the population that is uninsured and the state the patient is in, broken down by whether or not the patient was diagnosed within 90 days. Again, blue values indicate that the patient was, and pink values indicate that the patient was not.
figure
% Create scatter of selected data
scatter(allTrueData,"patient_state","health_uninsured","DisplayName","health_uninsured");
hold on
% Create scatter of selected data
scatter(allFalseData,"patient_state","health_uninsured","DisplayName","health_uninsured");
hold off
legend
We can see that in some states, such as GA, OK, or TX, the pink values come from populations that are generally higher in terms of being uninsured. This could indicate that in some states, coming from a zip code with a high population of uninsured individuals (or being uninsured yourself) means you are more likely to experience delays in your diagnosis.
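To put numbers behind a plot like this, one option is to average the uninsured rate per state and outcome. The sketch below uses the groupsummary function on a small made-up table: the values are invented for illustration, and only the column names mirror the dataset.

```matlab
% Hypothetical toy data; only the variable names mirror the real dataset
patient_state    = categorical(["GA"; "GA"; "GA"; "TX"; "TX"; "TX"]);
DiagPeriodL90D   = [1; 0; 1; 1; 0; 0];
health_uninsured = [10; 18; 12; 14; 22; 20];
demo = table(patient_state, DiagPeriodL90D, health_uninsured);

% Mean uninsured rate for each (state, outcome) pair
summaryTbl = groupsummary(demo, ["patient_state", "DiagPeriodL90D"], ...
    "mean", "health_uninsured");
disp(summaryTbl)
```

On the real data you could try the same call on the full training table, assuming it still contains the 'patient_state', 'DiagPeriodL90D', and 'health_uninsured' variables.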

Statistical Analysis

You can also make meaningful deductions by calculating various statistics from your data. For example, I want to calculate the skewness, or level of asymmetry, of each of my variables. A negative value indicates the data is left skewed when plotted, and a positive value indicates the data is right skewed when plotted, with a value of 0 indicating no skew, as in a symmetric distribution.
statsTrue = varfun(@skewness, allTrueData, "InputVariables", @isnumeric);
statsFalse = varfun(@skewness, allFalseData, "InputVariables", @isnumeric);
Now I want to see if any of the variables have a large difference in their skewness, as differences in the data distributions between patients who were diagnosed in a timely manner vs patients who were not could indicate an underlying relationship between these variables and time to diagnosis.
statsDiffs = abs(statsTrue{:, :} - statsFalse{:, :});
statsTrue.Properties.VariableNames(statsDiffs > 0.2)
ans = 1×4 cell
'skewness_density'    'skewness_age_over_80'    'skewness_rent_burden'    'skewness_race_native'
If we check the four variables that are returned, we can see that population density, the percentage of folks above 80 in your zip code, the median rent burden of your zip code, and the percentage of residents who reported their race as American Indian or Alaska Native in your zip code may have a relationship with time to diagnosis.
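If you want to see what the skewness value measures, here is a minimal sketch that computes it by hand on a made-up vector. It uses the third central moment divided by the cubed standard deviation (both computed with 1/n), which, to my understanding, is what the skewness function returns by default. The helper name sampleSkew and the data are just for illustration.

```matlab
% Hypothetical helper: population (biased) sample skewness of a vector
sampleSkew = @(v) mean((v - mean(v)).^3) / mean((v - mean(v)).^2)^1.5;

x = [1 2 2 3 3 3 4 10]';         % made-up data with a long right tail
skewRight = sampleSkew(x);       % positive: right skewed
skewLeft  = sampleSkew(-x);      % mirrored data: left skewed, same magnitude
skewNone  = sampleSkew([1 2 3 4 5]');  % symmetric data: zero skew

fprintf("right: %.3f, left: %.3f, symmetric: %.3f\n", skewRight, skewLeft, skewNone);
```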

Feature Engineering

When it comes to machine learning, you don't have to use all of the data as it is provided to you. Feature engineering is the process of deciding what data you want to use, creating new data based on the provided data, and transforming the data to be in whatever format or range is suitable for your workflow. You can do this manually, and some of the exploration we just did should influence decisions you make if you want to play around with including or excluding different variables.
For this blog, I'll use the gencfeatures function to automate this process. I want to use 90 features, which is 10 more than we currently have in our dataset, and it will go through and create a set of 90 meaningful features based on our processed dataset. It may keep some data as-is, but will often standardize numeric variables and create new variables by manipulating the provided data.
[T, augTrainData] = gencfeatures(fullData, "DiagPeriodL90D", 90)
Warning: Table variable names were truncated to the length namelengthmax.
T =

FeatureTransformer with properties:

Type: 'classification'
TargetLearner: 'linear'
NumEngineeredFeatures: 89
NumOriginalFeatures: 1
TotalNumFeatures: 90

augTrainData = 12157×91 table
metastatic_cancer_diagnosis_code zsc(woe2(breast_cancer_diagnosis_code)) zsc(woe2(breast_cancer_diagnosis_desc)) zsc(woe2(metastatic_cancer_diagnosis_code)) zsc(woe2(patient_state)) zsc(patient_age./Ozone) zsc(patient_age./commute_time) zsc(kmc51) eb28(education_less_highschool) zsc(income_household_35_to_50./income_household_75_to_100) zsc(kmc12) eb11(patient_age) q28(income_household_under_5) zsc(rent_burden-education_less_highschool) q11(patient_age) zsc(sig(family_dual_income)) zsc(sig(patient_age)) zsc(sin(PM25)) zsc(cos(rent_median)) zsc(sin(patient_zip3)) zsc(health_uninsured./PM25) zsc(cos(population)) zsc(cos(education_bachelors)) zsc(sin(hispanic)) q28(density) eb28(education_highschool) zsc(income_household_75_to_100.*rent_burden) q28(unemployment_rate) q28(patient_zip3) zsc(patient_id.*hispanic)
1 C7989 undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined
2 C773 undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined
3 C773 undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined
4 C773 undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined
5 C773 undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined
6 C7981 undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined
7 C779 undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined
8 C773 undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined
9 C773 undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined
10 C799 undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined
11 C7800 undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined
12 C781 undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined
13 C773 undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined
14 C786 undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined
To better understand the generated features, you can use the describe function of the returned FeatureTransformer object, 'T'.
describe(T)
Type IsOriginal InputVariables Transformations
___________ __________ _____________________________________________________ ________________________________________________________________________
metastatic_cancer_diagnosis_code Categorical true metastatic_cancer_diagnosis_code
zsc(woe2(breast_cancer_diagnosis_code)) Numeric false breast_cancer_diagnosis_code Weight of Evidence (positive class = 1)
    Standardization with z-score (mean = -0.046637, std = 1.5098)
zsc(woe2(breast_cancer_diagnosis_desc)) Numeric false breast_cancer_diagnosis_desc Weight of Evidence (positive class = 1)
    Standardization with z-score (mean = -0.046637, std = 1.5098)
zsc(woe2(metastatic_cancer_diagnosis_code)) Numeric false metastatic_cancer_diagnosis_code Weight of Evidence (positive class = 1)
    Standardization with z-score (mean = 0.0067098, std = 0.28786)
zsc(woe2(patient_state)) Numeric false patient_state Weight of Evidence (positive class = 1)
    Standardization with z-score (mean = 0.0060064, std = 0.23323)
zsc(patient_age./Ozone) Numeric false patient_age, Ozone patient_age ./ Ozone
    Standardization with z-score (mean = 1.5005, std = 0.36544)
zsc(patient_age./commute_time) Numeric false patient_age, commute_time patient_age ./ commute_time
    Standardization with z-score (mean = 2.1895, std = 0.64638)
zsc(kmc51) Numeric false all valid numeric variables Centroid encoding (component #51) (kmeans clustering with k = 10)
    Standardization with z-score (mean = 5.9447, std = 0.1673)
eb28(education_less_highschool) Categorical false education_less_highschool Equal-width binning (number of bins = 28)
zsc(income_household_35_to_50./income_household_75_to_100) Numeric false income_household_35_to_50, income_household_75_to_100 income_household_35_to_50 ./ income_household_75_to_100
    Standardization with z-score (mean = 0.93234, std = 0.2685)
zsc(kmc12) Numeric false all valid numeric variables Centroid encoding (component #12) (kmeans clustering with k = 10)
    Standardization with z-score (mean = 13.4409, std = 0.15797)
eb11(patient_age) Categorical false patient_age Equal-width binning (number of bins = 11)
q28(income_household_under_5) Categorical false income_household_under_5 Equiprobable binning (number of bins = 28)
zsc(rent_burden-education_less_highschool) Numeric false rent_burden, education_less_highschool rent_burden - education_less_highschool
    Standardization with z-score (mean = 19.3265, std = 5.7168)
q11(patient_age) Categorical false patient_age Equiprobable binning (number of bins = 11)
zsc(sig(family_dual_income)) Numeric false family_dual_income sigmoid( )
    Standardization with z-score (mean = 1, std = 4.2283e-11)
zsc(sig(patient_age)) Numeric false patient_age sigmoid( )
    Standardization with z-score (mean = 1, std = 4.0863e-10)
zsc(sin(PM25)) Numeric false PM25 sin( )
    Standardization with z-score (mean = 0.42558, std = 0.65419)
zsc(cos(rent_median)) Numeric false rent_median cos( )
    Standardization with z-score (mean = 0.046444, std = 0.68827)
zsc(sin(patient_zip3)) Numeric false patient_zip3 sin( )
    Standardization with z-score (mean = 0.054487, std = 0.70171)
zsc(health_uninsured./PM25) Numeric false health_uninsured, PM25 health_uninsured ./ PM25
    Standardization with z-score (mean = 1.1917, std = 0.6234)
zsc(cos(population)) Numeric false population cos( )
    Standardization with z-score (mean = -0.03209, std = 0.71354)
zsc(cos(education_bachelors)) Numeric false education_bachelors cos( )
    Standardization with z-score (mean = 0.096871, std = 0.68966)
zsc(sin(hispanic)) Numeric false hispanic sin( )
    Standardization with z-score (mean = 0.017785, std = 0.6817)
q28(density) Categorical false density Equiprobable binning (number of bins = 28)
eb28(education_highschool) Categorical false education_highschool Equal-width binning (number of bins = 28)
zsc(income_household_75_to_100.*rent_burden) Numeric false income_household_75_to_100, rent_burden income_household_75_to_100 .* rent_burden
    Standardization with z-score (mean = 392.7502, std = 61.6458)
q28(unemployment_rate) Categorical false unemployment_rate Equiprobable binning (number of bins = 28)
q28(patient_zip3) Categorical false patient_zip3 Equiprobable binning (number of bins = 28)
zsc(patient_id.*hispanic) Numeric false patient_id, hispanic patient_id .* hispanic
    Standardization with z-score (mean = 10169065.2502, std = 11587944.1233)
zsc(home_value.*race_other) Numeric false home_value, race_other home_value .* race_other
    Standardization with z-score (mean = 2725364.3718, std = 4298818.8992)
zsc(patient_age.*income_household_20_to_25) Numeric false patient_age, income_household_20_to_25 patient_age .* income_household_20_to_25
    Standardization with z-score (mean = 241.7171, std = 97.8001)
q25(farmer) Categorical false farmer Equiprobable binning (number of bins = 25)
q27(race_native) Categorical false race_native Equiprobable binning (number of bins = 27)
eb28(age_median) Categorical false age_median Equal-width binning (number of bins = 28)
q28(never_married) Categorical false never_married Equiprobable binning (number of bins = 28)
zsc(cos(patient_age)) Numeric false patient_age cos( )
    Standardization with z-score (mean = 0.021113, std = 0.71469)
zsc(sin(race_black)) Numeric false race_black sin( )
    Standardization with z-score (mean = 0.16517, std = 0.70668)
zsc(tanh(age_50s)) Numeric false age_50s tanh( )
    Standardization with z-score (mean = 1, std = 8.9224e-09)
zsc(male+female) Numeric false male, female male + female
    Standardization with z-score (mean = 100.0001, std = 0.000436)
q28(female) Categorical false female Equiprobable binning (number of bins = 28)
eb28(male) Categorical false male Equal-width binning (number of bins = 28)
zsc(sin(age_median)) Numeric false age_median sin( )
    Standardization with z-score (mean = -0.1365, std = 0.71613)
q28(home_ownership) Categorical false home_ownership Equiprobable binning (number of bins = 28)
zsc(age_over_80./income_household_20_to_25) Numeric false age_over_80, income_household_20_to_25 age_over_80 ./ income_household_20_to_25
    Standardization with z-score (mean = 1.0866, std = 0.51568)
zsc(cos(education_highschool)) Numeric false education_highschool cos( )
    Standardization with z-score (mean = -0.019221, std = 0.71994)
zsc(cos(race_black)) Numeric false race_black cos( )
    Standardization with z-score (mean = -0.020693, std = 0.68773)
q28(self_employed) Categorical false self_employed Equiprobable binning (number of bins = 28)
zsc(cos(age_median)) Numeric false age_median cos( )
    Standardization with z-score (mean = -0.029038, std = 0.68394)
q50(patient_id) Categorical false patient_id Equiprobable binning (number of bins = 50)
zsc(sin(race_asian)) Numeric false race_asian sin( )
    Standardization with z-score (mean = 0.28421, std = 0.64235)
q28(education_stem_degree) Categorical false education_stem_degree Equiprobable binning (number of bins = 28)
zsc(cos(age_20s)) Numeric false age_20s cos( )
    Standardization with z-score (mean = 0.10518, std = 0.69162)
eb23(N02) Categorical false N02 Equal-width binning (number of bins = 23)
q28(rent_burden) Categorical false rent_burden Equiprobable binning (number of bins = 28)
zsc(race_asian.*veteran) Numeric false race_asian, veteran race_asian .* veteran
    Standardization with z-score (mean = 28.4889, std = 30.7)
zsc(sin(income_household_35_to_50)) Numeric false income_household_35_to_50 sin( )
    Standardization with z-score (mean = 0.03083, std = 0.68752)
zsc(cos(patient_zip3)) Numeric false patient_zip3 cos( )
    Standardization with z-score (mean = -0.06867, std = 0.7071)
eb28(rent_burden) Categorical false rent_burden Equal-width binning (number of bins = 28)
zsc(sig(rent_burden)) Numeric false rent_burden sigmoid( )
    Standardization with z-score (mean = 1, std = 3.571e-10)
q28(age_over_80) Categorical false age_over_80 Equiprobable binning (number of bins = 28)
q28(family_dual_income) Categorical false family_dual_income Equiprobable binning (number of bins = 28)
q28(family_size) Categorical false family_size Equiprobable binning (number of bins = 28)
zsc(age_over_80./income_household_5_to_10) Numeric false age_over_80, income_household_5_to_10 age_over_80 ./ income_household_5_to_10
    Standardization with z-score (mean = 2.0422, std = 1.3415)
eb28(age_10_to_19) Categorical false age_10_to_19 Equal-width binning (number of bins = 28)
q28(income_individual_median) Categorical false income_individual_median Equiprobable binning (number of bins = 28)
zsc(age_over_80./unemployment_rate) Numeric false age_over_80, unemployment_rate age_over_80 ./ unemployment_rate
    Standardization with z-score (mean = 0.74942, std = 0.37691)
zsc(cos(income_household_50_to_75)) Numeric false income_household_50_to_75 cos( )
    Standardization with z-score (mean = -0.012865, std = 0.69717)
eb25(race_pacific) Categorical false race_pacific Equal-width binning (number of bins = 25)
zsc(sin(patient_id)) Numeric false patient_id sin( )
    Standardization with z-score (mean = -0.0018454, std = 0.70739)
zsc(race_native./race_multiple) Numeric false race_native, race_multiple race_native ./ race_multiple
    Standardization with z-score (mean = 0.14079, std = 0.41944)
eb28(income_household_25_to_35) Categorical false income_household_25_to_35 Equal-width binning (number of bins = 28)
zsc(age_50s-income_household_75_to_100) Numeric false age_50s, income_household_75_to_100 age_50s - income_household_75_to_100
    Standardization with z-score (mean = 0.77657, std = 2.1264)
zsc(cos(age_60s)) Numeric false age_60s cos( )
    Standardization with z-score (mean = 0.05337, std = 0.75178)
q28(income_household_35_to_50) Categorical false income_household_35_to_50 Equiprobable binning (number of bins = 28)
eb21(race_black) Categorical false race_black Equal-width binning (number of bins = 21)
zsc(sin(income_individual_median)) Numeric false income_individual_median sin( )
    Standardization with z-score (mean = 0.045145, std = 0.69873)
q28(age_50s) Categorical false age_50s Equiprobable binning (number of bins = 28)
q28(race_white) Categorical false race_white Equiprobable binning (number of bins = 28)
q28(age_under_10) Categorical false age_under_10 Equiprobable binning (number of bins = 28)
q28(disabled) Categorical false disabled Equiprobable binning (number of bins = 28)
zsc(patient_age./income_household_100_to_150) Numeric false patient_age, income_household_100_to_150 patient_age ./ income_household_100_to_150
    Standardization with z-score (mean = 3.9266, std = 1.314)
q28(income_household_75_to_100) Categorical false income_household_75_to_100 Equiprobable binning (number of bins = 28)
zsc(sin(N02)) Numeric false N02 sin( )
    Standardization with z-score (mean = 0.039533, std = 0.70149)
eb28(family_size) Categorical false family_size Equal-width binning (number of bins = 28)
q28(limited_english) Categorical false limited_english Equiprobable binning (number of bins = 28)
q28(income_household_100_to_150) Categorical false income_household_100_to_150 Equiprobable binning (number of bins = 28)
zsc(farmer.*race_black) Numeric false farmer, race_black farmer .* race_black
    Standardization with z-score (mean = 10.7649, std = 26.8957)
zsc(home_value.*race_pacific) Numeric false home_value, race_pacific home_value .* race_pacific
    Standardization with z-score (mean = 59826.8413, std = 128896.4218)
zsc(education_graduate.*health_uninsured) Numeric false education_graduate, health_uninsured education_graduate .* health_uninsured
    Standardization with z-score (mean = 97.7642, std = 54.0304)
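Two of the transformations that show up repeatedly in this list, z-score standardization ("zsc") and equal-width binning ("eb"), are easy to reproduce by hand. The sketch below is only an illustration on made-up ages: the variable names are hypothetical, and the exact bin edges that gencfeatures picks may differ from the simple min-to-max edges assumed here.

```matlab
x = [21 35 48 52 67 80 91]';     % made-up patient ages

% z-score standardization: subtract the mean, divide by the standard deviation
z = (x - mean(x)) / std(x);

% Equal-width binning: split the observed range into bins of identical width
numBins = 4;                                      % assumed bin count for this toy example
edges   = linspace(min(x), max(x), numBins + 1);  % 21, 38.5, 56, 73.5, 91
binIdx  = discretize(x, edges);                   % bin number for each observation
binned  = categorical(binIdx);                    % treat the bin as a categorical feature

disp(table(x, z, binned))
```

Equiprobable binning ("q") works the same way, except that the edges are chosen so each bin holds roughly the same number of observations.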

Split the Data

The last step before you can train a machine learning model is to split your data into a training and testing set. We'll use the training data to fit the model, and the testing set to evaluate how well the model performs on new data before we use it to make a submission. Here I split the data into 80% training and 20% testing.
numRows = height(augTrainData);
[trainInd, ~, testInd] = dividerand(numRows, .8, 0, .2);
trainingData = augTrainData(trainInd, :);
testingData = augTrainData(testInd, :);
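If you would rather not depend on dividerand, one alternative (not what I used above) is the cvpartition function, which, when you pass it the class labels, keeps the class proportions roughly equal in both sets. Here is a minimal sketch on made-up labels; the 80/20 class mix is an assumption for illustration only.

```matlab
rng(42);                                   % make the random split repeatable
labels = [ones(80, 1); zeros(20, 1)];      % made-up labels: 80% class 1, 20% class 0

% Hold out 20% of the rows for testing, stratified by label
c = cvpartition(labels, "HoldOut", 0.2);
trainMask = training(c);                   % logical index of training rows
testMask  = test(c);                       % logical index of testing rows

fprintf("train: %d rows (%.0f%% class 1), test: %d rows (%.0f%% class 1)\n", ...
    sum(trainMask), 100 * mean(labels(trainMask)), ...
    sum(testMask),  100 * mean(labels(testMask)));
```

With the real table, the same masks could be used as augTrainData(training(c), :) and augTrainData(test(c), :), with the partition built from augTrainData.DiagPeriodL90D.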

Train a Machine Learning Model

In this example, I'll create a binary decision tree using the fitctree function and set 'OptimizeHyperparameters' to 'auto', which will attempt to minimize the error of our algorithm by choosing the best value for the 'MinLeafSize' parameter. It visualizes the results of adjusting this value, as can be seen below.
classificationTree = fitctree(trainingData, "DiagPeriodL90D", ...
    OptimizeHyperparameters='auto');
|======================================================================================|
| Iter | Eval | Objective | Objective | BestSoFar | BestSoFar | MinLeafSize |
| | result | | runtime | (observed) | (estim.) | |
|======================================================================================|
| 1 | Best | 0.18764 | 1.4699 | 0.18764 | 0.18764 | 1676 |
| 2 | Accept | 0.18764 | 0.87349 | 0.18764 | 0.18764 | 162 |
| 3 | Accept | 0.20923 | 1.005 | 0.18764 | 0.19426 | 36 |
| 4 | Accept | 0.29395 | 1.6132 | 0.18764 | 0.18764 | 3 |
| 5 | Accept | 0.18764 | 0.6073 | 0.18764 | 0.1876 | 491 |
| 6 | Accept | 0.38012 | 0.21492 | 0.18764 | 0.24104 | 4858 |
| 7 | Accept | 0.18764 | 0.60759 | 0.18764 | 0.18764 | 330 |
| 8 | Accept | 0.18764 | 0.36986 | 0.18764 | 0.18763 | 1033 |
| 9 | Accept | 0.19227 | 1.0609 | 0.18764 | 0.18762 | 80 |
| 10 | Accept | 0.24409 | 1.4868 | 0.18764 | 0.18761 | 13 |
| 11 | Accept | 0.18764 | 0.3479 | 0.18764 | 0.18568 | 1363 |
| 12 | Accept | 0.18764 | 0.70426 | 0.18764 | 0.1861 | 231 |
| 13 | Accept | 0.18764 | 0.48941 | 0.18764 | 0.18678 | 698 |
| 14 | Accept | 0.29519 | 2.1238 | 0.18764 | 0.18671 | 1 |
| 15 | Accept | 0.18764 | 0.35153 | 0.18764 | 0.18736 | 1438 |
| 16 | Accept | 0.18764 | 0.86203 | 0.18764 | 0.18735 | 119 |
| 17 | Accept | 0.18764 | 0.41595 | 0.18764 | 0.18734 | 849 |
| 18 | Accept | 0.18764 | 0.31486 | 0.18764 | 0.18737 | 1527 |
| 19 | Accept | 0.18764 | 0.60161 | 0.18764 | 0.18738 | 404 |
| 20 | Accept | 0.18764 | 0.45615 | 0.18764 | 0.18738 | 589 |
|======================================================================================|
| Iter | Eval | Objective | Objective | BestSoFar | BestSoFar | MinLeafSize |
| | result | | runtime | (observed) | (estim.) | |
|======================================================================================|
| 21 | Accept | 0.18764 | 0.30864 | 0.18764 | 0.18745 | 1515 |
| 22 | Accept | 0.18764 | 0.71981 | 0.18764 | 0.18745 | 138 |
| 23 | Accept | 0.18764 | 0.62974 | 0.18764 | 0.18745 | 278 |
| 24 | Accept | 0.18764 | 0.27013 | 0.18764 | 0.18749 | 1511 |
| 25 | Accept | 0.18764 | 0.62894 | 0.18764 | 0.18749 | 196 |
| 26 | Accept | 0.18764 | 0.40254 | 0.18764 | 0.18749 | 811 |
| 27 | Accept | 0.30239 | 0.19617 | 0.18764 | 0.18741 | 2944 |
| 28 | Accept | 0.18764 | 0.27176 | 0.18764 | 0.18741 | 1170 |
| 29 | Accept | 0.18764 | 0.37273 | 0.18764 | 0.18747 | 1576 |
| 30 | Accept | 0.18764 | 0.45381 | 0.18764 | 0.18747 | 945 |
__________________________________________________________
Optimization completed.
MaxObjectiveEvaluations of 30 reached.
Total function evaluations: 30
Total elapsed time: 50.9097 seconds
Total objective function evaluation time: 20.2308

Best observed feasible point:
MinLeafSize
___________
1676

Observed objective function value = 0.18764
Estimated objective function value = 0.18815
Function evaluation time = 1.4699

Best estimated feasible point (according to models):
MinLeafSize
___________
1527

Estimated objective function value = 0.18747
Estimated function evaluation time = 0.36237

I used a binary tree as my starting point, but it's important to try out different types of algorithms to see what works best with your data! Check out the Classification Learner app documentation and this short video to learn how to train several machine learning models quickly and iteratively!

Test Your Model

There are many ways to evaluate the performance of a machine learning model, so in this blog I'll show how to do so by computing validation accuracy and using testing data.

Validation Accuracy

Cross-validation is one method of evaluating a model, and at a high level is done by:
  1. Setting aside a subset of the training data, called validation data
  2. Using the rest of the training data to fit the model
  3. Testing how well the model performs on the validation data
You can use the crossval function to do this:
% Perform cross-validation
partitionedModel = crossval(classificationTree, 'KFold', 5);
Then, extract the misclassification rate, and subtract it from 1 to get the model's accuracy. The closer to 1 this value is, the more accurate our model is.
% Compute validation accuracy
validationAccuracy = 1 - kfoldLoss(partitionedModel, LossFun='ClassifError')
validationAccuracy = 0.8124
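To make the three cross-validation steps concrete, here is a rough sketch of roughly what crossval and kfoldLoss automate, written as an explicit loop. The data is synthetic (two random predictors and a noisy label), so the accuracy it prints says nothing about the datathon dataset, and the simple unweighted average over folds is an approximation of what kfoldLoss reports.

```matlab
rng(0);                                        % repeatable synthetic data
X = randn(200, 2);                             % two made-up numeric predictors
y = double(X(:, 1) + 0.5 * randn(200, 1) > 0); % noisy binary label driven by column 1

c = cvpartition(y, "KFold", 5);                % 5 stratified folds
foldErr = zeros(c.NumTestSets, 1);
for k = 1:c.NumTestSets
    trainIdx = training(c, k);                 % step 2: rows used to fit the model
    valIdx   = test(c, k);                     % step 1: rows set aside for validation
    mdl  = fitctree(X(trainIdx, :), y(trainIdx));
    pred = predict(mdl, X(valIdx, :));
    foldErr(k) = mean(pred ~= y(valIdx));      % step 3: misclassification rate on the fold
end

validationAccuracy = 1 - mean(foldErr)         % average over folds
```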

Testing Data

In this section, we'll use the 'testingData' dataset we created earlier. Similar to what we did with the validation data, we can use the loss function to compute the misclassification rate when you use the classification tree on the testing data, and subtract it from 1 to get a measure of accuracy.
testAccuracy = 1 - loss(classificationTree, testingData, "DiagPeriodL90D", ...
    LossFun='classiferror')
testAccuracy = 0.8048
I also want to compare the predictions that the model makes to the actual outputs, so let's remove the 'DiagPeriodL90D' variable from our testing data:
testActual = testingData.DiagPeriodL90D;
testingData = removevars(testingData, "DiagPeriodL90D");
Now, use the model to make predictions on the testing set:
[testPreds, scores, ~, ~] = predict(classificationTree, testingData);
And use the confusionchart function to compare the predicted outputs to the actual outputs, to see how often they match or don't.
confusionchart(testActual, testPreds)
This shows that the model almost always correctly predicts the 1s (patients who were diagnosed within 90 days), but it has close to a 50/50 chance of correctly predicting the 0s.
We can also use the test data and predictions to visualize receiver operating characteristic (ROC) metrics. The ROC curve shows the true positive rate (TPR) versus the false positive rate (FPR) for different thresholds of classification scores. The "Model Operating Point" shows the false positive rate and true positive rate of the model.
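If you want the numbers behind the chart, the confusionmat function returns the same counts as a matrix, and the per-class rates follow from its rows. The sketch below uses ten made-up labels and predictions (the variable names demoActual and demoPreds are hypothetical), not the actual model output.

```matlab
demoActual = [1 1 1 1 0 0 1 0 1 1]';   % made-up true labels
demoPreds  = [1 1 1 1 1 0 1 0 1 0]';   % made-up predictions

% Rows are true classes, columns are predicted classes, in sorted order [0 1]
C = confusionmat(demoActual, demoPreds);

tnr = C(1, 1) / sum(C(1, :));   % share of true 0s predicted as 0
tpr = C(2, 2) / sum(C(2, :));   % share of true 1s predicted as 1
fprintf("TPR = %.2f, TNR = %.2f\n", tpr, tnr);
```

A large gap between the two rates, like the one seen in the chart, is a hint that the model leans toward the majority class.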
rocObj = rocmetrics(testActual, scores, classificationTree.ClassNames);
plot(rocObj)
Here we can see that the classifier correctly assigns about 90-95% of the 1 class observations to 1 (TPR), but incorrectly assigns about 40% of the 0 class observations as 1 (FPR). This is similar to what we saw with the confusion chart.
You can also extract the area under the curve (AUC) value, which is a measure of the overall quality of the classifier. The AUC values are in the range 0 to 1, and larger AUC values indicate better classifier performance.
rocObj.AUC
The AUC is pretty high, but shows that there's definitely room for improvement. To learn more about ROC metrics, check out this documentation page that explains it in more detail.

Create Submission

Once you have a model that performs well on the validation and testing data, it's time to create a submission for the datathon! As a reminder, you will upload this file to Kaggle to be scored on the leaderboard.
First, import the 'Test' dataset:
testDataFilename = 'Test.csv';
allTestData = readtable(fullfile(dataFolder, testDataFilename))
allTestData = 3999×83 table
patient_id patient_race payer_type patient_state patient_zip3 patient_age patient_gender bmi breast_cancer_diagnosis_code breast_cancer_diagnosis_desc metastatic_cancer_diagnosis_code metastatic_first_novel_treatment metastatic_first_novel_treatment_type Region Division population density age_median age_under_10 age_10_to_19 age_20s age_30s age_40s age_50s age_60s age_70s age_over_80 male female married
1 undefined 'White' 'MEDICAID' 'IN' undefined undefined 'F' NaN 'C50412' 'Malig neoplasm of upper-outer quadrant of left female breast' 'C773' NaN NaN 'Midwest' 'East North Central' undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined
2 undefined 'COMMERCIAL' 'FL' undefined undefined 'F' NaN 'C50912' 'Malignant neoplasm of unspecified site of left female breast' 'C787' NaN NaN 'South' 'South Atlantic' undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined
3 undefined 'Hispanic' 'MEDICAID' 'CA' undefined undefined 'F' NaN 'C50911' 'Malignant neoplasm of unsp site of right female breast' 'C773' NaN NaN 'West' 'Pacific' undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined
4 undefined 'Hispanic' 'MEDICARE ADVANTAGE' 'CA' undefined undefined 'F' NaN 'C50912' 'Malignant neoplasm of unspecified site of left female breast' 'C779' NaN NaN 'West' 'Pacific' undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined
5 undefined 'Black' 'CA' undefined undefined 'F' undefined 'C50412' 'Malig neoplasm of upper-outer quadrant of left female breast' 'C779' NaN NaN 'West' 'Pacific' undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined
6 undefined 'COMMERCIAL' 'MI' undefined undefined 'F' undefined '1748' 'Malignant neoplasm of other specified sites of female breast' 'C7800' NaN NaN 'Midwest' 'East North Central' undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined
7 undefined 'COMMERCIAL' 'TX' undefined undefined 'F' NaN 'C50912' 'Malignant neoplasm of unspecified site of left female breast' 'C773' NaN NaN 'South' 'West South Central' undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined
8 undefined 'White' 'MEDICARE ADVANTAGE' 'IN' undefined undefined 'F' NaN 'C50212' 'Malig neoplasm of upper-inner quadrant of left female breast' 'C773' NaN NaN 'Midwest' 'East North Central' undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined
9 undefined 'COMMERCIAL' 'AZ' undefined undefined 'F' NaN 'C50919' 'Malignant neoplasm of unsp site of unspecified female breast' 'C773' NaN NaN 'West' 'Mountain' undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined
10 undefined 'COMMERCIAL' 'CA' undefined undefined 'F' undefined 'C50412' 'Malig neoplasm of upper-outer quadrant of left female breast' 'C7801' NaN NaN 'West' 'Pacific' undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined
11 undefined 'White' 'MEDICAID' 'CO' undefined undefined 'F' NaN 'C50919' 'Malignant neoplasm of unsp site of unspecified female breast' 'C7931' NaN NaN 'West' 'Mountain' undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined
12 undefined 'White' 'MEDICAID' 'KY' undefined undefined 'F' NaN '1749' 'Malignant neoplasm of breast (female), unspecified' 'C7931' NaN NaN 'South' 'East South Central' undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined
13 undefined 'COMMERCIAL' 'IL' undefined undefined 'F' NaN '1749' 'Malignant neoplasm of breast (female), unspecified' 'C7931' NaN NaN 'Midwest' 'East North Central' undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined
14 undefined 'Asian' 'COMMERCIAL' 'IL' undefined undefined 'F' undefined 'C50912' 'Malignant neoplasm of unspecified site of left female breast' 'C773' NaN NaN 'Midwest' 'East North Central' undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined undefined
Then we need to process this dataset in the same way that we did the training data. In this section, I use code instead of the live tasks for simplicity.
% replace cell arrays with categoricals
varTypes = varfun(@class, allTestData, OutputFormat="cell");
catIdx = strcmp(varTypes, "cell");
varNames = allTestData.Properties.VariableNames;
catVarNames = varNames(catIdx);
for catNameIdx = 1:length(catVarNames)
    allTestData.(catVarNames{catNameIdx}) = categorical(allTestData.(catVarNames{catNameIdx}));
end
% remove variables with too many missing data points
allTestData = removevars(allTestData, ["bmi", "metastatic_first_novel_treatment", "metastatic_first_novel_treatment_type"]);
% remove rows with 2+ missing points
fullTestData = rmmissing(allTestData, "MinNumMissing", 2);
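To see what the MinNumMissing option does, here is a small sketch on a made-up three-row table (the variable names and values are hypothetical). Only rows with at least two missing entries are dropped, so rows with a single gap are kept:

```matlab
% Toy table: rows 1 and 2 each have one missing value, row 3 has two
toyTable = table([1; 2; 3], [NaN; 5; NaN], ["a"; missing; missing], ...
    VariableNames=["id", "x", "s"]);

% Drop only the rows that have 2 or more missing entries
cleaned = rmmissing(toyTable, "MinNumMissing", 2);

fprintf("Rows before: %d, rows after: %d\n", height(toyTable), height(cleaned));
```

Keep in mind that any row removed this way will not receive a prediction later on, so it is worth comparing the row counts before and after this step.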
We also need to use the transform function to create the same features as we created using gencfeatures for the training data.
augTestData = transform(T, fullTestData);
Now that the data is in the format our machine learning model expects, use the predict function to make predictions, and create a table to contain the patient IDs and corresponding predictions.
submissionPreds = predict(classificationTree, augTestData);
submissionTable = table(fullTestData.patient_id, submissionPreds, VariableNames=["patient_id", "DiagPeriodL90D"])
submissionTable = 3780×2 table
patient_id DiagPeriodL90D
Last, export your predictions to a .CSV file, then upload to Kaggle for scoring.
writetable(submissionTable, "Predictions.csv");
And that's it! Thank you for following along with this tutorial, and best of luck to all participants. If you have any questions about this tutorial or MATLAB, reach out to us at studentcompetitions@mathworks.com or by tagging gracewoolson in the forum. Keep your eye out for our upcoming WiDS Workshop on January 31st, where we will walk through this tutorial and answer any questions you have along the way!
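Before uploading, it can be worth reading the file back in and running a few sanity checks. This is a minimal sketch on a made-up three-row table written to a temporary file (the file name and values are hypothetical); the same checks could be pointed at your own Predictions.csv:

```matlab
% Toy submission standing in for submissionTable (hypothetical values)
toySubmission = table([101; 102; 103], [1; 0; 1], ...
    VariableNames=["patient_id", "DiagPeriodL90D"]);

% Write to a temporary location and read the file back
outFile = fullfile(tempdir, "ToyPredictions.csv");
writetable(toySubmission, outFile);
check = readtable(outFile);

% The header should match the expected column names
headerOk = isequal(check.Properties.VariableNames, {'patient_id', 'DiagPeriodL90D'});
% Each patient ID should appear exactly once
idsUnique = numel(unique(check.patient_id)) == height(check);
% No prediction or ID should be missing
noMissing = ~any(ismissing(check), "all");

fprintf("Header OK: %d, IDs unique: %d, no missing values: %d\n", ...
    headerOk, idsUnique, noMissing);

delete(outFile);   % clean up the temporary file
```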
